We apply the objective method of Aldous to the problem of finding
the minimum cost edge-cover of the complete graph with random
independent and identically distributed edge-costs. The limit, as the
number of vertices goes to infinity, of the expected minimum cost
for this problem is known via a combinatorial approach of Hessler
and Wästlund. We provide a proof of this result using the machinery
of the objective method and local weak convergence, which was
used to prove the $\zeta(2)$ limit of the random assignment problem.
A proof via the objective method is useful because it provides us
more information on the nature of the edges incident on a typical
root in the minimum cost edge-cover. We further show that a belief
propagation algorithm converges asymptotically to the optimal
solution. This finds application in a computational linguistics problem
of semantic projection. The belief propagation algorithm yields a
near optimal solution with lower complexity than the best known
algorithms designed for optimality in worst-case settings.

1. Introduction. Suppose that we are given a graph $G$ with vertex
set $V$ and edge set $E$, denoted $G = (V, E)$. Each edge $e \in E$ has a
weight $\xi_e \in \mathbb{R}_+$. Alternatively, we are given a bipartite graph
with a vertex set $V = V_1 \cup V_2$, a union of two disjoint vertex subsets,
and an edge set $E \subseteq V_1 \times V_2$. An edge-cover for the graph is a
subset of edges that hits (covers) every vertex. The cost of an edge-cover
is the sum of the weights of edges in the cover. Our interest in this paper
is in minimum cost edge-covers on the complete graph (denoted $K_n$ when
$|V| = n$) and on the complete bipartite graph (denoted $K_{n,n}$ when
$|V_1| = |V_2| = n$), when the edge weights are independent random
variables, each with the exponential distribution of mean 1.
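
To make the definition concrete, here is a minimal brute-force sketch that finds a minimum cost edge-cover on a tiny graph. The function name `min_cost_edge_cover` and the example weights are illustrative assumptions, not taken from the paper; the search is exponential in the number of edges and is meant only to illustrate the definition.

```python
from itertools import combinations


def min_cost_edge_cover(n, weights):
    """Brute-force minimum cost edge-cover on vertices 0..n-1.

    `weights` maps an edge (u, v) with u < v to its nonnegative weight.
    Exponential in the number of edges, so only usable on tiny graphs.
    """
    edges = list(weights)
    best_cost, best_cover = float("inf"), None
    for size in range(1, len(edges) + 1):
        for subset in combinations(edges, size):
            covered = {v for e in subset for v in e}
            if len(covered) == n:  # every vertex is hit by some edge
                cost = sum(weights[e] for e in subset)
                if cost < best_cost:
                    best_cost, best_cover = cost, set(subset)
    return best_cost, best_cover


# Hypothetical weights on K_4: the three edges at vertex 0 are cheap.
example_weights = {
    (0, 1): 1.0, (0, 2): 1.0, (0, 3): 1.0,
    (1, 2): 10.0, (1, 3): 10.0, (2, 3): 10.0,
}
cost, cover = min_cost_edge_cover(4, example_weights)
```

In this example the optimal cover is the star at vertex 0, so a vertex may be covered more than once; this is what distinguishes an edge-cover from a perfect matching.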
The following example on a bipartite graph illustrates how minimum cost
edge-covers arise in practice.

An example of semantic projection. Computational linguists have
recently been interested in machine-based natural language processing tasks.
These include part-of-speech tagging, parsing, and at a higher level,
semantic role parsing [12] which, for example, would enable an automatic
recognition that the sentences "Mary sold the book to John" and "The book
was sold by Mary to John" have the same semantic roles. (This example is
taken from Wikipedia [19].) Currently, English is blessed with the
availability of a large amount of annotated text as training data while
most other languages lack this advantage. Semantic projection exploits the
availability of (1) parallel corpora of translated texts and (2) higher
quality parsing tools in one language in order to transfer annotations from
the resource-rich language to the other.

Padó and Lapata [12] provide one method to do this where a minimum
cost edge-cover naturally arises. The source and target sentences in the two
languages are first broken into linguistic units to yield sets $V_1$ and $V_2$
of the respective linguistic units. These linguistic units are then viewed as
vertices of a complete bipartite graph. Let $R$ be some finite set of semantic
roles, which can be viewed for our purposes as abstract annotations. The
parsing tool on the source side is used to find a semantic role assignment
$\mathrm{role}_1 : R \to 2^{V_1}$, where the subscript refers to the source
language. A dissimilarity measure based on linguistic considerations is then
assigned to every pair of linguistic units across the languages, and is
denoted $\xi : V_1 \times V_2 \to \mathbb{R}_+$. A decision procedure uses these
dissimilarity scores to find a subset $C \subseteq V_1 \times V_2$ of
semantically aligned units. Padó and Lapata [12] argue that a minimum
cost edge-cover is a good choice for this semantic alignment. It allows a
linguistic unit in one language (an element of say $V_2$) to map to several
units in the other language (a subset of $V_1$), and vice versa. For example,
the linguistic units "to be on time" and "punctual" (English) could both be
mapped with small but possibly different dissimilarity scores to "pünktlich"
(German), and both edges may be picked by a good candidate edge-cover.
The covering property of the edge-cover enables all source and target vertices
to participate, and thus has the potential to capture important connections
between linguistic units which may otherwise be missed. The minimum cost
property attempts to provide an economical semantic alignment, and further
captures global alignments as compared to previously proposed local decision
procedures. Once the minimum cost edge-cover is found by the decision
procedure, semantic roles are then assigned on the target side as

$$ \mathrm{role}_2(r) = \{ j \mid \text{there is an } i \in \mathrm{role}_1(r) \text{ such that } (i, j) \in C \}. $$
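
The role transfer rule above is a simple set computation once an alignment is available. The following is a minimal sketch; the function name `project_roles`, the role labels and the German units are hypothetical illustrations and are not drawn from [12].

```python
def project_roles(role_source, alignment):
    """Transfer semantic roles across an alignment C, a set of (i, j) pairs.

    role_source maps each role r to the set role_1(r) of source units.
    Returns role_2, mapping r to {j : (i, j) in C for some i in role_1(r)}.
    """
    return {
        role: {j for (i, j) in alignment if i in units}
        for role, units in role_source.items()
    }


# Hypothetical toy data: unit names and role labels are illustrative only.
role_1 = {"Seller": {"Mary"}, "Goods": {"the book"}}
C = {("Mary", "Maria"), ("the book", "das Buch"), ("sold", "verkaufte")}
role_2 = project_roles(role_1, C)
```

Because `C` is an edge-cover rather than a matching, a role may project onto several target units when a source unit is aligned to more than one of them.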

Padó and Lapata [12] compare the goodness of their decision procedures
based on minimum cost edge-cover (and perfect matching) with some other
prior approaches on a data set of about 1000 sentences. Real data sets
are of course much larger. The resulting graph, when restricted to edges
of small weight (i.e., edges signifying low dissimilarity and therefore good
correspondence), can be modeled as a large but sparse random graph. If
$|V_1| = \Theta(|V_2|) = n$, algorithms used by Padó and Lapata [12] to find the
minimum cost edge-cover take $O(n^3)$ operations in the worst case.

The actual results of the Padó and Lapata [12] experiments need not
concern us here. For a list of challenges that arise in the implementation of 
the above approach and methods to address them, we refer the linguistically 
inclined reader to [13] and references therein. What we shall take with us as 
we move forward are the observations that (1) edge-covers arise in practice 
on large graphs that can be modeled by sparse random graphs, and (2) 
algorithmic simplifications that reduce complexity are of practical value. 

We shall for simplicity focus on minimum cost edge-covers on the complete
graph $K_n$ on $n$ vertices. All our results carry over to $K_{n,n}$ with only
scaling factor modifications. Recall that the edge costs are independent, each
edge having the exponential distribution with mean 1. This is a typical
mean-field model which captures sparsity of the graph depicting linguistic
units and associated edges in the above example, but ignores correlations
among edge weights. See Section 11 for another geometric setting where
the same mean-field models arise. Let $C_n$ be the cost of the minimum cost
edge-cover of $K_n$. We prove that the expected value of $C_n$ converges to the
constant $W(1) + W(1)^2/2$, which is approximately 0.728. (The function
$W(\cdot)$ is Lambert's $W$-function, which is the inverse of
$f : [0, \infty) \to [0, \infty)$, $f(x) = x e^x$; $W(1) \approx 0.567$.) Further,
and more importantly from an application perspective, we show that a belief
propagation algorithm can be used to find asymptotically optimal edge-covers
in $O(n^2)$ steps. The results, with only scaling factor changes, hold for the
complete bipartite graphs $K_{n,n}$.
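
The numerical values quoted above are easy to reproduce. The sketch below solves $w e^w = 1$ by Newton's method and evaluates the limiting constant; the function name and the starting point are our own choices, and the bipartite value uses the formula $2W(1) + W(1)^2$ stated later in Theorem 3.

```python
import math


def lambert_w_at_one(tol=1e-15, max_iter=100):
    """Solve w * exp(w) = 1 for w >= 0 by Newton's method."""
    w = 0.5  # any starting point in [0, 1] works for this convex problem
    for _ in range(max_iter):
        step = (w * math.exp(w) - 1.0) / (math.exp(w) * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


w1 = lambert_w_at_one()
limit_complete = w1 + w1 ** 2 / 2   # limit of E[C_n] on K_n
limit_bipartite = 2 * w1 + w1 ** 2  # corresponding limit on K_{n,n}
```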

The result regarding the limit on $K_{n,n}$ has been proved before by Hessler
and Wästlund in [10] using a combinatorial approach. A proof based on a
game formulation is contained in [16]. We discuss these works at the end of
this introduction. Our focus in this article is on using the objective method
for this problem and on devising a belief propagation algorithm.

The roots of the objective method lie in Aldous's 1992 paper [1] on the
assignment problem. The problem of finding the minimum cost matching on
the complete bipartite graph with independent and identically distributed
edge-costs, termed the random assignment problem in the literature, inspired
a series of works in combinatorial probability. Mézard and Parisi [11], using
the cavity method of statistical physics, conjectured in 1987 that the expected
minimum cost for the random assignment problem on the bipartite graph
$K_{n,n}$ converges to $\zeta(2) = \sum_{k=1}^{\infty} k^{-2}$ as $n$ goes to
infinity. This was proved rigorously by Aldous [4] in 2001 by extending the
proof of existence of the limit contained in [1]. Several other proofs have
been provided for the limit in subsequent works.

In [4], Aldous related the problem on $K_{n,n}$ to one on a suitable limit
object. Several calculations become easier on the limit object. In this case,
the limit is a tree, the so-called Poisson weighted infinite tree or PWIT,
with many useful symmetries. Aldous used these symmetries to construct a
distributional identity that then served as a guide for solving the random
assignment problem rigorously. With this approach, Aldous showed that the
following quantities converge to the corresponding quantities on the limit
object:

• the expected cost of the optimal matching on $K_{n,n}$;

• the distribution of the cost of the matching edge incident on a typical
node of $K_{n,n}$;

• the probability that the matching edge incident on a typical node of
$K_{n,n}$ is the $k$-th smallest of all the edges incident on it.

It turns out that the limit object, and hence the answers, remain the same
for problems on the complete bipartite graph $K_{n,n}$ and on the complete
graph $K_n$. One dividend of a proof via the objective method is that we have
answers to several ancillary questions such as the second and third bullets
above. The ability of the objective method to provide these auxiliary results
motivates us to solve the problem of optimal edge-cover via the objective
method.

From an algorithms perspective, the cavity equations suggest a natural
iterative decentralized message passing algorithm, some versions of which are
commonly called belief propagation (BP) in the computer science literature.
For many combinatorial optimization problems, a BP algorithm can be set
up to converge to the correct solution on graphs without cycles. Bayati,
Shah and Sharma [7] proved that the BP algorithm for maximum weight
matching on bipartite graphs converges to the correct value as long as
the maximum weight matching is unique. Salez and Shah [14] studied the
random assignment problem and proved a tighter connection with the limit
object. They showed that a BP algorithm on $K_{n,n}$ converges to an
update rule on the limit PWIT of [4]. The iterates on the limit graph
converge in distribution to the minimum cost assignment. The iterates are
near the optimal solution in $O(n^2)$ steps whereas the worst case optimal
algorithm on bipartite graphs is $O(n^3)$ (expected time $O(n^2 \log n)$ for
i.i.d. edge costs); see Salez and Shah [14] and references therein. We show a
similar complexity improvement for the edge-cover problem.

The objective method is powerful enough to apply to several combinatorial
probability problems. See Aldous and Steele [3] for a survey.
Aldous and Bandyopadhyay [5, Sec. 7.5] outline the steps of Aldous's
program to establish the validity of the cavity method, which we quote in
Section 11. However, each problem requires specific proofs, and we are still
far from a complete theory applicable to a wide class of problems. The
edge-cover problem itself poses some modest problem-specific challenges
which we overcome in this paper. These include (1) a proof of existence
and uniqueness of a solution to the distributional identity associated with
the edge-cover problem, (2) a proof of a property called endogeny of a
process on the tree associated with the distributional identity, (3) a proof
of optimality of the edge-cover selection on the PWIT as suggested by
the distributional identity, and eventually (4) a proof that a BP algorithm
converges to an asymptotically optimal edge-cover on the random complete
graph. See Section 11 for a more detailed summary.

Before we end this introduction, we would like to mention two other
approaches that have been used to solve related combinatorial optimization
problems, in particular, matching, edge-cover and travelling salesman
problems. One approach used by Wästlund in [16, 18] calls for a "boundary
conditioning" parameter to study "diluted" versions of the optimization
problems, eventually driving the parameter to infinity, and thereby relating
the resulting limiting problem with the undiluted versions. For example,
in the matching case, diluted matching is a partial matching with each
unmatched vertex paying a cost equal to the parameter. Wästlund then
formulates the optimization problem in terms of a game played on the
graph. A second and more combinatorial approach is used by Wästlund
in [17] for matching and TSP, and in [10] for the edge-cover problem. These
works study the respective optimization problems as certain flow problems
on bipartite graphs. The feasible solutions to these flow problems have a
fixed number of edges $k$. A recursive relation on $k$ is obtained for the cost
of the optimal solution. As our focus is on the objective method, we do not
dwell any more on these approaches.
2. Main results. Our first result establishes the limit of the expected
minimum cost of the random edge-cover problem.

Theorem 1. On $K_n$, we have

$$ \lim_{n \to \infty} \mathbb{E} C_n = W(1) + \frac{W(1)^2}{2}. \tag{1} $$

Our second result shows that a belief propagation algorithm gives an
edge-cover that is asymptotically optimal as $n \to \infty$. We will use the
result that the update rule of BP converges to an update rule on a limit
infinite tree. For this we define the BP algorithm on an arbitrary graph
$G = (V, E)$ with edge-costs. For an edge $e = \{v, w\} \in E$ we write its cost
as $\xi_G(e)$ or $\xi_G(v, w)$. For each vertex $v \in V$, we associate a
nonempty subset of its neighbours $\pi_G(v)$. By taking a union of all edges of
the form $\{v, w\}$, $w \in \pi_G(v)$, we get an edge-cover of $G$ which we will
denote by $\mathcal{C}(\pi_G)$.

The BP algorithm is an iterative message passing algorithm. In each
iteration $k \ge 0$, every vertex $v \in V$ sends a message $X_G^k(w, v)$ to each
neighbour $w \sim v$ according to the following rules:

Initialization:

$$ X_G^0(w, v) = 0. \tag{2} $$

Update rule:

$$ X_G^{k+1}(w, v) = \min_{u \sim v,\, u \ne w} \left( \xi_G(v, u) - X_G^k(v, u) \right)^+ . \tag{3} $$

Decision rule:

$$ \pi_G^k(v) = \arg\min_{u \sim v} \left( \xi_G(v, u) - X_G^k(v, u) \right)^+ . \tag{4} $$

$$ \text{Edge cover} = \mathcal{C}(\pi_G^k). \tag{5} $$
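
The rules (2)-(5) can be sketched in a few lines of Python on the complete graph. This is a minimal illustration under stated assumptions: messages are assumed to start at zero, the decision rule keeps every neighbour attaining the minimum positive part (all ties at zero, or the single minimizer), and the function name `bp_edge_cover` is ours. The implementation is deliberately naive and is not tuned for the complexity discussed in the introduction.

```python
import random


def bp_edge_cover(n, cost, iterations):
    """Run BP rules (2)-(4) on the complete graph with vertices 0..n-1, n >= 3.

    cost[v][u] is the symmetric edge cost; msg[(w, v)] is the message
    that v sends to w. Returns the edge-cover (5) as a set of frozensets.
    """
    def pos(x):
        return max(x, 0.0)

    # (2) initialization: assumed to be zero for every directed pair
    msg = {(w, v): 0.0 for v in range(n) for w in range(n) if w != v}
    for _ in range(iterations):
        new = {}
        for (w, v) in msg:
            # (3) update: minimize over neighbours u of v other than w
            new[(w, v)] = min(
                pos(cost[v][u] - msg[(v, u)])
                for u in range(n) if u != v and u != w
            )
        msg = new
    cover = set()
    for v in range(n):
        # (4) decision: neighbours attaining the minimum positive part
        score = {u: pos(cost[v][u] - msg[(v, u)]) for u in range(n) if u != v}
        best = min(score.values())
        for u, s in score.items():
            if s == best:
                cover.add(frozenset((v, u)))
    return cover


rng = random.Random(1)
n = 8
cost = [[0.0] * n for _ in range(n)]
for v in range(n):
    for u in range(v + 1, n):
        cost[v][u] = cost[u][v] = rng.expovariate(1.0)
cover = bp_edge_cover(n, cost, iterations=20)
```

Every vertex selects at least one neighbour, so the output is always an edge-cover; whether it is close to optimal on a small instance such as this one is not guaranteed by the sketch.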

We analyze the belief propagation algorithm for $G = K_n$ and i.i.d.
exponential random edge-costs, and prove that after a sufficiently large
number of iterations, the expected cost of the edge-cover given by the BP
algorithm is close to the limit value in Theorem 1.

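Theorem 2. On $K_n$, we have

$$ \lim_{k \to \infty} \lim_{n \to \infty} \mathbb{E} \Big[ \sum_{e \in \mathcal{C}(\pi_{K_n}^k)} \xi_{K_n}(e) \Big] = W(1) + \frac{W(1)^2}{2}. \tag{6} $$
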
The formal statements on the bipartite complete graph $K_{n,n}$ with i.i.d.
exponential edge-costs of mean 1 are the following and are stated without
proof.

Theorem 3. On $K_{n,n}$, we have

$$ \lim_{n \to \infty} \mathbb{E} C_n = 2W(1) + W(1)^2. \tag{7} $$

Theorem 4. On $K_{n,n}$, we have

$$ \lim_{k \to \infty} \lim_{n \to \infty} \mathbb{E} \Big[ \sum_{e \in \mathcal{C}(\pi_{K_{n,n}}^k)} \xi_{K_{n,n}}(e) \Big] = 2W(1) + W(1)^2. \tag{8} $$


3. Local weak convergence. In this section, we recollect the
terminology for defining convergence of graphs.

3.1. Rooted geometric networks. A graph $G = (V, E)$ along with a length
function $l : E \to (0, \infty]$ is called a network. The distance between two
vertices in the network is the infimum of the sum of lengths of the edges of a
path connecting the two vertices, the infimum being taken over all such paths.
We call the network a geometric network if for each vertex $v \in V$ and
positive real $\rho$, the number of vertices within a distance $\rho$ of $v$ is
finite. We denote the space of geometric networks by $\mathcal{G}$.

A geometric network with a distinguished vertex $v$ is called a rooted
geometric network with root $v$. We denote the space of all connected rooted
geometric networks by $\mathcal{G}_*$. In $\mathcal{G}_*$ we do not distinguish
between rooted isomorphisms of the same network. We will use the notation
$(G, o)$ to denote an element of $\mathcal{G}_*$ which is the isomorphism class
of rooted networks with underlying network $G$ and root $o$.

3.2. Local weak convergence. We call a positive real number $\rho$ a
continuity point of $G$ if no vertex of $G$ is exactly at a distance $\rho$ from
the root of $G$. Let $N_\rho(G)$ denote the neighbourhood of the root of $G$ up
to distance $\rho$. $N_\rho(G)$ contains all vertices of $G$ which are within a
distance $\rho$ from the root of $G$ (Figure 1). We take $N_\rho(G)$ to be an
element of $\mathcal{G}_*$, by inheriting the same length function $l$ as $G$,
and the same root as that of $G$.
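
The vertex set of such a neighbourhood can be computed with a shortest-path search that stops at distance $\rho$. The sketch below uses Dijkstra's algorithm with a cutoff; the adjacency-dictionary representation and the function name `neighbourhood` are assumptions made for illustration, and the code returns only the vertices with their distances, not the inherited edge structure.

```python
import heapq


def neighbourhood(adj, root, rho):
    """Vertices within distance rho of root, with their distances.

    adj maps a vertex to a dict {neighbour: edge length}; distances are
    shortest-path lengths, computed by Dijkstra's algorithm with a cutoff.
    """
    dist = {root: 0.0}
    heap = [(0.0, root)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for u, length in adj[v].items():
            nd = d + length
            if nd <= rho and nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist


# Hypothetical four-vertex network rooted at "o".
adj = {
    "o": {"a": 1.0, "b": 2.5},
    "a": {"o": 1.0, "b": 1.0, "c": 3.0},
    "b": {"o": 2.5, "a": 1.0},
    "c": {"a": 3.0},
}
ball = neighbourhood(adj, "o", rho=2.2)
```

Here vertex "b" is included through the path o-a-b of length 2.0 even though the direct edge o-b has length 2.5, while "c" lies at distance 4.0 and is excluded.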

We say that a sequence of rooted geometric networks $G_n$, $n \ge 1$ converges
locally to an element $G_\infty$ in $\mathcal{G}_*$ if for each continuity point
$\rho$ of $G_\infty$, there is an $n_\rho$ such that for all $n \ge n_\rho$, there
exists a graph isomorphism $\gamma_{n,\rho}$ from $N_\rho(G_\infty)$ to
$N_\rho(G_n)$ that maps the root of the former to the root of the
latter, and for each edge $e$ of $N_\rho(G_\infty)$, the length of
$\gamma_{n,\rho}(e)$ converges to the length of $e$ as $n \to \infty$.

The space $\mathcal{G}_*$ can be suitably metrized to make it a separable and
complete metric space. One can then consider probability measures on this
space and endow that space with the topology of weak convergence of measures.
This notion of convergence is called local weak convergence.

In our setting of complete graphs $K_n = (V_n, E_n)$ with random i.i.d.
edge-costs $\{\xi_e, e \in E_n\}$, we regard the edge-costs to be the lengths
of the edges, and declare a vertex of $K_n$ chosen uniformly at random as the
root of $K_n$. This makes $K_n$ along with its root a random element of
$\mathcal{G}_*$. We rescale the edge-costs such that for each $n$,
$\{\xi_e, e \in E_n\}$ are i.i.d. random variables with the exponential
distribution of mean $n$. We will denote this random, rooted, rescaled version
of the $n$-vertex complete graph by $\overline{K}_n$ to distinguish it from
the $K_n$ defined earlier. Theorem 5 stated below (from [1, Aldous 1992]) says
that the sequence of random geometric networks $\overline{K}_n$ converges in
the local weak sense to an element of $\mathcal{G}_*$ called the Poisson
weighted infinite tree (PWIT).

3.3. Poisson weighted infinite tree. We use the notation from [14] to
define the PWIT.

Denote by $\mathcal{V}$ the set of all finite words over the alphabet
$\mathbb{N} = \{1, 2, 3, \ldots\}$. Let $\phi$ denote the empty string and "."
the concatenation operator. For any $v \in \mathcal{V}$ write $|v|$ for the
length of the string $v$, and if $v \ne \phi$ write $\dot{v}$ for the string
obtained by removing the last letter of $v$.

Construct an undirected graph $\mathcal{T} = (\mathcal{V}, \mathcal{E})$ on
$\mathcal{V}$ with the edge set

$$ \mathcal{E} = \{ \{v, v.i\} : v \in \mathcal{V},\ i \in \mathbb{N} \}. $$

Set $\phi$ to be the root of $\mathcal{T}$. Then $\mathcal{T}$ is an infinite
rooted tree with each vertex having a countably infinite number of children.
Construct a family of independent Poisson processes of intensity 1 on
$\mathbb{R}_+$:

$$ \{ \xi^v = (\xi^v_1, \xi^v_2, \ldots),\ v \in \mathcal{V} \}. $$

Assign to each edge $\{v, v.i\}$ in $\mathcal{E}$ the length $\xi^v_i$.
$\mathcal{T}$ is then a random element of $\mathcal{G}_*$, and we call it the
Poisson weighted infinite tree (PWIT) (Figure 2).

Fig 1. Neighbourhood $N_\rho(G)$ of graph $G$. The solid edges form the
neighbourhood, and form paths of length at most $\rho$ from the root $v$.
Dashed edges are the other edges of $G$.

Fig 2. PWIT $\mathcal{T}$ up to depth 2, with only the first three children of
each vertex shown.
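
A finite piece of the PWIT, such as the one depicted in Figure 2, can be sampled directly from this construction. The sketch below keeps only the first few children of each vertex up to a fixed depth; the function name `sample_pwit` and the tuple encoding of words are our own assumptions.

```python
import random


def sample_pwit(depth, children, rng):
    """Sample a truncated PWIT: `children` shortest edges per vertex, to `depth`.

    Vertices are tuples of positive integers (the empty tuple is the root).
    Returns a dict mapping each edge (v, v + (i,)) to its length.
    """
    lengths = {}
    frontier = [()]
    for _ in range(depth):
        next_frontier = []
        for v in frontier:
            point = 0.0
            for i in range(1, children + 1):
                # Gaps of a rate-1 Poisson process are i.i.d. Exp(1).
                point += rng.expovariate(1.0)
                child = v + (i,)
                lengths[(v, child)] = point
                next_frontier.append(child)
        frontier = next_frontier
    return lengths


tree = sample_pwit(depth=2, children=3, rng=random.Random(0))
```

By construction the edge from a vertex to its $i$-th child is the $i$-th point of that vertex's Poisson process, so edge lengths increase with the child index.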

Theorem 5 (Aldous 1992). The sequence of uniformly rooted random
networks $\overline{K}_n$ converges to the PWIT $\mathcal{T}$ as $n \to \infty$
in the sense of local weak convergence.

A similar result was earlier established by Hajek [9, Sec. IV] for a class
of sparse Erdős-Rényi random graphs. The above theorem says that if we
look at an arbitrarily large but fixed neighbourhood of the root of
$\overline{K}_n$, then for large $n$ it looks like the corresponding
neighbourhood of the root of $\mathcal{T}$. This suggests that if boundary
conditions can be ignored we may be able to relate optimal edge-covers on $K_n$
with an appropriate edge-cover on $\mathcal{T}$ (to be precise, an optimal
involution invariant edge-cover (Section 5) on the PWIT). Furthermore, the
local neighbourhood of the root of $\overline{K}_n$ is a tree for large enough
$n$ (with high probability). So we may expect belief propagation on $K_n$ to
converge. Both the above observations are true in the matching case; the
former was established in [1, 4], and the latter was shown in [14]. We now
extend these ideas to prove similar results for the edge-cover problem.

4. Recursive distributional equation. 

4.1. A heuristic recursion. The PWIT $\mathcal{T}$ is an infinite graph, and it
is clear that any edge-cover on it must have infinite cost. So it does not make
sense to talk about a minimum cost edge-cover on $\mathcal{T}$. However, for a
moment let us pretend to perform operations on the minimum cost as if it were a
finite quantity. Write $C(\mathcal{T})$ for this minimum cost, and define

$$ D(\mathcal{T}) = \left( C(\mathcal{T}) - C(\mathcal{T} \setminus \{\phi\}) \right)^+ , \tag{9} $$

where $C(\mathcal{T} \setminus \{\phi\})$ is the minimum cost of edge-cover on
the subgraph of $\mathcal{T}$ obtained by removing the root. Note that
$D(\mathcal{T})$ denotes the difference between the minimum cost of edge-cover
of $\mathcal{T}$ and the minimum cost of partial edge-cover of $\mathcal{T}$
where the root $\phi$ can be left uncovered.

If $j$ is a child of the root, let $\mathcal{T}^j$ denote the induced subgraph
of $\mathcal{T}$ containing $j$ and all its descendants, and view it as a rooted
network with root $j$ (Figure 3). Define $D(\mathcal{T}^j)$ accordingly, and
observe from the symmetry of $\mathcal{T}$ that $\{D(\mathcal{T}^j), j \ge 1\}$
are i.i.d., and have the same distribution as $D(\mathcal{T})$.
We give a heuristic argument that $D(\mathcal{T})$ satisfies the following
relation.

$$ D(\mathcal{T}) = \min_{j \ge 1} \left( \xi_j - D(\mathcal{T}^j) \right)^+ . \tag{10} $$

Fig 3. PWIT $\mathcal{T}$ with the subtrees at node $j$.

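We can write $C(\mathcal{T} \setminus \{\phi\})$ in terms of edge-covers on the
subtrees $\mathcal{T}^j$, $j \ge 1$ as

$$ C(\mathcal{T} \setminus \{\phi\}) = \sum_{j \ge 1} C(\mathcal{T}^j). \tag{11} $$

Let us consider edge-covers in which the edges covering the root are incident
on the vertices in a fixed subset $A$ of the children of the root. The minimum
cost among such edge-covers can be written as

$$ \sum_{j \in A} \left( \xi_j + \min\{ C(\mathcal{T}^j),\ C(\mathcal{T}^j \setminus \{j\}) \} \right) + \sum_{i \in \mathbb{N} \setminus A} C(\mathcal{T}^i), $$

where $\xi_j$ is the length of the edge joining the root to its child $j$.
$C(\mathcal{T})$ is the minimum of the above value taken over all nonempty $A$,
i.e.,

$$ C(\mathcal{T}) = \min_{A \subset \mathbb{N},\, A \ne \emptyset} \Big[ \sum_{j \in A} \left( \xi_j + \min\{ C(\mathcal{T}^j),\ C(\mathcal{T}^j \setminus \{j\}) \} \right) + \sum_{j \in \mathbb{N} \setminus A} C(\mathcal{T}^j) \Big]. \tag{12} $$

Thus, we can write

$$ D(\mathcal{T}) = \Big( \min_{A \subset \mathbb{N},\, A \ne \emptyset} \sum_{j \in A} \big( \xi_j - \left( C(\mathcal{T}^j) - C(\mathcal{T}^j \setminus \{j\}) \right)^+ \big) \Big)^+ . $$

To minimize the term within parentheses, we must include all those indices
$j$ for which the summand $(\xi_j - D(\mathcal{T}^j))$ is negative. If the terms
are positive for all indices $j$, $A$ must be the singleton where the minimum is
attained among all indices. By then taking the positive part, equation (10)
follows.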
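
The last step, that minimizing a sum over nonempty subsets and then taking the positive part gives the minimum of the positive parts, can be checked numerically on finite lists. The sketch below is only a sanity check of that elementary identity; the random numbers are hypothetical stand-ins for the summands $\xi_j - D(\mathcal{T}^j)$, and the function names are ours.

```python
from itertools import combinations
import random


def min_over_subsets(terms):
    """Positive part of the minimum of sum(terms[j] for j in A), A nonempty."""
    best = min(
        sum(subset)
        for size in range(1, len(terms) + 1)
        for subset in combinations(terms, size)
    )
    return max(best, 0.0)


def min_of_positive_parts(terms):
    """min_j (terms[j])^+, the form appearing on the right-hand side of (10)."""
    return min(max(t, 0.0) for t in terms)


rng = random.Random(0)
for _ in range(200):
    # Hypothetical finite stand-ins for the summands xi_j - D(T^j).
    terms = [rng.uniform(-1.0, 2.0) for _ in range(6)]
    assert abs(min_over_subsets(terms) - min_of_positive_parts(terms)) < 1e-12
```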

Although $D(\mathcal{T})$ and $D(\mathcal{T}^j)$ are not well defined
quantities, we shall prove that there is a nonnegative random variable $X$ and
i.i.d. random variables $X_j$, $j \ge 1$ having the same distribution as $X$,
such that

$$ X \stackrel{d}{=} \min_{j \ge 1} \left( \xi_j - X_j \right)^+ , \tag{13} $$

where $\{\xi_j, j \ge 1\}$ are points of a Poisson process of rate 1 on
$\mathbb{R}_+$, independent of $\{X_j, j \ge 1\}$.

4.2. Recursive distributional equations and recursive tree processes.
Equations of the form (13) are termed recursive distributional equations
in [5]. Specifically, if $\mathcal{P}(S)$ denotes the space of probability
measures on a space $S$, a recursive distributional equation (RDE) is a
fixed-point equation on $\mathcal{P}(S)$ of the form

$$ X \stackrel{d}{=} g\left( \xi; (X_j, 1 \le j \le N) \right), \tag{14} $$

where $X_j$, $j \ge 1$ are i.i.d. $S$-valued random variables having the same
distribution as $X$, and are independent of the pair $(\xi, N)$; $\xi$ is a
random variable on some space, $N$ is a random variable on
$\mathbb{N} \cup \{+\infty\}$, and $g$ is a given $S$-valued function. A
solution to the RDE is a common distribution of $X, X_j, j \ge 1$
satisfying (14).

We can use the relation (14) to construct a tree indexed stochastic process,
say $X_i$, $i \in \mathcal{V}$, which is called a recursive tree process
(RTP) [5]. Associate to each vertex $i \in \mathcal{V}$ an independent copy
$(\xi_i, N_i)$ of the pair $(\xi, N)$, and require $X_i$ to satisfy

$$ X_i = g\left( \xi_i; (X_{i.j}, 1 \le j \le N_i) \right). $$

If $\mu$ is a solution to the RDE (14), there exists a stationary RTP, i.e.
each $X_i$ is distributed as $\mu$. Such a process is called an invariant RTP
with marginal distribution $\mu$.

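4.3. Solution to the edge-cover RDE.

Theorem 6. The unique solution to the RDE (13) is the cdf $F_*$, whose
complementary cdf $\bar{F}_*$ is given by

$$ \bar{F}_*(y) = \begin{cases} W(1) e^{-y} & \text{if } y \ge 0, \\ 1 & \text{if } y < 0. \end{cases} \tag{15} $$

The function $W$ above is Lambert's $W$-function, the inverse of
$f : \mathbb{R}_+ \to \mathbb{R}_+$, $f(x) = x e^x$. In particular,
$W(1) e^{W(1)} = 1$.

Proof. Let $\mu$ be a solution to the RDE (13), and let $F$ be its cdf.
Take $X_j$, $j \ge 1$ i.i.d. with distribution $\mu$. Then
$\{(\xi_j, X_j), j \ge 1\}$ is a Poisson process on
$\mathbb{R}_+ \times \mathbb{R}_+$ with intensity $dz\, dF(x)$. For
$y \in \mathbb{R}_+$,

$$ P(X > y) = P\left( \text{No point of } \{(\xi_j, X_j)\} \text{ in } \{(z, x) : z - x \le y\} \right) = \exp\Big( - \iint_{z - x \le y} dz\, dF(x) \Big). $$

Writing $\bar{F}(t) = 1 - F(t)$, we have

$$ \bar{F}(y) = \exp\Big( - \int_0^\infty (x + y)\, dF(x) \Big) = e^{-y} \exp\Big( - \int_0^\infty \bar{F}(t)\, dt \Big). $$

Hence $\bar{F}(y) = c\, e^{-y}$ for $y \ge 0$, where the constant
$c = \exp\left( - \int_0^\infty \bar{F}(t)\, dt \right)$ satisfies $c = e^{-c}$.
The unique $c$ satisfying the above equation is $c = W(1)$. This proves that $F$
must be the cdf $F_*$. □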
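
As a sanity check on Theorem 6, one can sample from the claimed solution and apply the map in (13) once: the output should again have complementary cdf $W(1) e^{-y}$. The Monte Carlo sketch below does this under two assumptions of ours: the Poisson points are cut off at a finite value (points beyond it almost never attain the minimum), and the sampler uses the fact that the claimed law has an atom of mass $1 - W(1)$ at zero and is Exp(1) otherwise. Function names are illustrative.

```python
import math
import random

W1 = 0.5671432904097838  # Lambert W(1), the root of w * exp(w) = 1


def sample_from_f_star(rng):
    """Draw X with P(X > y) = W(1) * exp(-y) for y >= 0 (atom at 0)."""
    return rng.expovariate(1.0) if rng.random() < W1 else 0.0


def rde_step(rng, cutoff=30.0):
    """One draw of min_j (xi_j - X_j)^+ with Poisson points cut at `cutoff`."""
    best = float("inf")
    point = rng.expovariate(1.0)  # first point of the rate-1 Poisson process
    while point <= cutoff and best > 0.0:
        best = min(best, max(point - sample_from_f_star(rng), 0.0))
        point += rng.expovariate(1.0)
    return best


rng = random.Random(0)
draws = [rde_step(rng) for _ in range(20000)]
empirical = sum(d > 0.5 for d in draws) / len(draws)
predicted = W1 * math.exp(-0.5)  # complementary cdf of Theorem 6 at y = 0.5
```

The two numbers `empirical` and `predicted` should agree up to Monte Carlo error; this illustrates the fixed-point property but is of course no substitute for the proof.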



5. Unimodularity and involution invariance. In Section 3 we defined
the space $\mathcal{G}_*$ as the set of connected rooted geometric networks.
Now define $\mathcal{G}_{**}$ as the space of connected geometric networks with
an ordered pair of distinguished vertices. Again, we do not distinguish between
isomorphisms in $\mathcal{G}_{**}$, and denote by $(G, o, v)$ the isomorphism
class of elements with underlying network $G$ and distinguished vertex pair
$(o, v)$. We endow this space with the topology of local convergence in the
same way as in $\mathcal{G}_*$, except that for the isomorphism between the
local neighbourhoods of two graphs, we require that the distinguished ordered
vertex pair of one graph maps to the distinguished pair of the other graph.
There is a suitable metric for this convergence that makes $\mathcal{G}_{**}$
a complete separable metric space.

A probability measure $\mu$ on $\mathcal{G}_*$ is called unimodular if it
satisfies the following for all Borel $f : \mathcal{G}_{**} \to [0, \infty]$:

$$ \int \sum_{v \in V(G)} f(G, o, v)\, d\mu(G, o) = \int \sum_{v \in V(G)} f(G, v, o)\, d\mu(G, o). $$

A measure $\mu$ on $\mathcal{G}_*$ that satisfies the above for all Borel $f$
supported on $\{(G, o, v) : o \sim v\}$ is said to be involution invariant. It
is clear that the set of unimodular measures is a subset of the set of
involution invariant measures. Proposition 2.2 of [2] shows that involution
invariance is equivalent to unimodularity.

Involution invariance is characterized alternatively in [3] as follows. Given
a measure $\mu$ on $\mathcal{G}_*$, define a measure $\mu^*$ on
$\mathcal{G}_{**}$ by letting its marginal measure on $\mathcal{G}_*$ be $\mu$
and the conditional measure on the second vertex given a rooted geometric
network $G$ be the counting measure on the neighbours of the root of $G$.
Specifically,

$$ \mu^*(B) = \int \sum_{v \sim o} \mathbf{1}\{(G, o, v) \in B\}\, d\mu(G, o) \quad \text{for Borel } B \subseteq \mathcal{G}_{**}. $$

Then $\mu$ is involution invariant if $\mu^*$ is invariant under the involution
transformation

$$ \iota : \mathcal{G}_{**} \to \mathcal{G}_{**}, \qquad \iota(G, o, v) = (G, v, o). $$

Involution $\iota$ swaps the order of the distinguished pair of vertices,
leaving all else unchanged.

We say that a random edge-cover $\mathcal{C}$ is involution invariant if the
distribution of the component of the edge-cover containing the root, which is a
distribution on $\mathcal{G}_*$, is involution invariant.

In our model, the complete graphs are randomly rooted. Write $C_n^*$ for
the component of the root in the minimum cost edge-cover of $K_n$, with
the same root as $K_n$. Then $C_n^*$ is a random element of $\mathcal{G}_*$,
and by symmetry it is easy to see that its distribution is involution
invariant. From Section 5.2 of [3], we see that involution invariance is
preserved under weak limits in the metric space $\mathcal{G}_*$. Consequently,
if the sequence $C_n^*$, $n \ge 1$ converges to an element $C^*$ in
$\mathcal{G}_*$, then the distribution of $C^*$ will be involution invariant.
This motivates us to study involution invariant edge-covers on the limit
PWIT.

6. Optimal involution invariant edge-cover on the PWIT. 

6.1. A tree process based on the RDE. In the PWIT we split each
undirected edge into two directed edges. For a general graph $G$, we use the
notation $\vec{E}(G)$ to denote the set of directed edges so obtained. If
$\xi_e$ is the cost of the undirected edge $e = \{u, v\}$, we assign the same
cost to both of the corresponding directed edges, and write the costs as
$\xi(u, v) = \xi(v, u) = \xi_e$. To each directed edge $\vec{e} = (u, v)$ we
will assign a random variable denoted by $X(\vec{e})$ or $X(u, v)$. Typically,
$X(u, v)$ will be different from $X(v, u)$. The $X$ process is constructed in
the following lemma, which is an analogue of Lemma 5.8 of [3], and is proved
similarly. We include the proof here for completeness.

Lemma 1. There exists a process

$$ \left( \mathcal{T},\ (\xi_e, e \in E(\mathcal{T})),\ (X(\vec{e}), \vec{e} \in \vec{E}(\mathcal{T})) \right) $$

where $\mathcal{T}$ is a PWIT with edge-lengths $\{\xi_e, e \in E(\mathcal{T})\}$
and $\{X(\vec{e}), \vec{e} \in \vec{E}(\mathcal{T})\}$ is a stochastic process
satisfying the following properties.

(a) For each directed edge $(u, v) \in \vec{E}(\mathcal{T})$,

$$ X(u, v) = \min\left\{ \left( \xi(v, w) - X(v, w) \right)^+ : (v, w) \in \vec{E}(\mathcal{T}),\ w \ne u \right\}. \tag{16} $$

(b) If $(u, v) \in \vec{E}(\mathcal{T})$ is directed away from the root of
$\mathcal{T}$, then $X(u, v)$ has the distribution $F_*$, as in (15).

(c) If $\{u, v\} \in E(\mathcal{T})$, the random variables $X(u, v)$ and
$X(v, u)$ are independent.

(d) For a fixed $z > 0$, conditional on the event that there exists an edge
of length $z$ at the root, say $\{\phi, v_z\}$, the random variables
$X(\phi, v_z)$ and $X(v_z, \phi)$ are independent random variables each having
the distribution $F_*$.

Proof. Fix an integer $d \ge 1$. We create independent random variables
from the distribution $F_*$, and assign one to each directed edge $(v, w)$ of
$\mathcal{T}$ where $v$ is at depth $d - 1$, and $w$ is at depth $d$ from the
root. Then if $d > 1$ use the relation (16) to recursively define random
variables $X(t, u)$, where $t \sim u$ are vertices of $\mathcal{T}$ within depth
$d$ from the root. This generates a collection of random variables
$\mathscr{C}_d$ whose joint distribution satisfies Properties (a), (b),
and (c) in the statement of the Lemma for all vertices of $\mathcal{T}$ up to a
depth $d$ from the root. It is easy to see that the sequence of collections
$\{\mathscr{C}_d, d \ge 1\}$ satisfies the conditions of the Kolmogorov
consistency theorem. So there exists a collection $\mathscr{C}_\infty$ such that
the restriction to random variables corresponding to vertices up to depth $d$
is equal in distribution to the collection $\mathscr{C}_d$ for each $d \ge 1$.
This implies that the random variables in $\mathscr{C}_\infty$ satisfy
Properties (a), (b), and (c).

To prove Property (d), observe that a Poisson process conditioned to
have a point at $z$ is also a Poisson process of the same intensity when that
point is removed. Now conditional on the existence of the edge
$\{\phi, v_z\}$ of length $z$, if we remove this edge the PWIT splits into two
subtrees. Letting $\phi$ and $v_z$ be the roots of these two subtrees, we find
that the two subtrees are independent copies of the original PWIT
$\mathcal{T}$. From the construction in the previous paragraph, it is clear
that conditionally the random variables $X(\phi, v_z)$ and $X(v_z, \phi)$ are
independent, and have the same distribution $F_*$.

□

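6.2. An involution invariant edge-cover on the PWIT. We use the
process $\{X(\vec{e})\}$ to construct an edge-cover $\mathcal{C}_{\mathrm{opt}}$
on $\mathcal{T}$. For each vertex $v$ of the PWIT, define a set

$$ \mathcal{C}_{\mathrm{opt}}(v) = \arg\min_{y \sim v} \left\{ \left( \xi(v, y) - X(v, y) \right)^+ \right\}. \tag{17} $$

In words, include in $\mathcal{C}_{\mathrm{opt}}(v)$ all $y \sim v$ such that
$\xi(v, y) - X(v, y) \le 0$, and if there is no such $y$, then
$\mathcal{C}_{\mathrm{opt}}(v) = \{w\}$ where $w$ is the unique (with
probability 1) neighbour of $v$ that minimizes $\xi(v, \cdot) - X(v, \cdot)$.
Alternatively,

$$ \mathcal{C}_{\mathrm{opt}}(v) = \arg\min\Big\{ \sum_{y \in A} \left( \xi(v, y) - X(v, y) \right) : A \subset N_v,\ A \text{ nonempty} \Big\}, \tag{18} $$

where $N_v$ denotes the set of neighbours of $v$.
Define the edge-cover to be

$$ \mathcal{C}_{\mathrm{opt}} = \bigcup_{v} \left\{ \{v, w\} : w \in \mathcal{C}_{\mathrm{opt}}(v) \right\}. $$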

The following lemma reassures us that the chosen edge-cover does not
include wasteful edges.

Lemma 2. For any two vertices $v, w$ of $\mathcal{T}$, we have

$$ w \in \mathcal{C}_{\mathrm{opt}}(v) \iff \xi(v, w) \le X(v, w) + X(w, v). $$

As a consequence,

$$ v \in \mathcal{C}_{\mathrm{opt}}(w) \iff w \in \mathcal{C}_{\mathrm{opt}}(v). $$

Proof. Suppose $w \in \mathcal{C}_{\mathrm{opt}}(v)$. If
$\xi(v, w) \le X(v, w)$ then, since $X(w, v) \ge 0$, we have
$\xi(v, w) \le X(v, w) + X(w, v)$.

If $\xi(v, w) > X(v, w)$, then the definition (17) of
$\mathcal{C}_{\mathrm{opt}}(v)$ and $w$'s membership in this set imply that $w$
is the only element of

$$ \arg\min_{y \sim v} \left\{ \left( \xi(v, y) - X(v, y) \right)^+ \right\}, $$

i.e.,

$$ \xi(v, w) - X(v, w) < \left( \xi(v, y) - X(v, y) \right)^+ \quad \text{for all } y \sim v,\ y \ne w. $$

Hence,

$$ \xi(v, w) - X(v, w) \le \min\left\{ \left( \xi(v, y) - X(v, y) \right)^+ : y \sim v,\ y \ne w \right\} = X(w, v), $$

where the last equality follows from (16). We have thus established one
direction of the first statement, i.e.,

$$ w \in \mathcal{C}_{\mathrm{opt}}(v) \implies \xi(v, w) \le X(v, w) + X(w, v). $$

Conversely, suppose that $\xi(v, w) \le X(v, w) + X(w, v)$. Then
$X(w, v) \ge \xi(v, w) - X(v, w)$. Also $X(w, v) \ge 0$. Therefore

$$ X(w, v) \ge \left( \xi(v, w) - X(v, w) \right)^+ , $$

i.e.,

$$ \min_{y \sim v,\ y \ne w} \left( \xi(v, y) - X(v, y) \right)^+ \ge \left( \xi(v, w) - X(v, w) \right)^+ . $$

It follows that

$$ w \in \arg\min_{y \sim v} \left( \xi(v, y) - X(v, y) \right)^+ , $$

and hence $w \in \mathcal{C}_{\mathrm{opt}}(v)$. Thus, we have established the
first statement of the lemma, which is

$$ w \in \mathcal{C}_{\mathrm{opt}}(v) \iff \xi(v, w) \le X(v, w) + X(w, v). $$

The condition on the right hand side above is symmetric in $v, w$, and hence
the second statement of the lemma is proved. □

The following lemma asserts that the edge-cover $C_{\mathrm{opt}}$ satisfies involution
invariance. See Section 5 for the definition. The proof is similar to the proof of
Lemma 24 of [4].

Lemma 3. $C_{\mathrm{opt}}$ is involution invariant.

Proof. Given the edge costs $\xi$ and the $X$ process on $\mathcal{T}$, the edge-cover
$C_{\mathrm{opt}}$ does not depend on the labels. The relation (16) for the $X$ process is
also independent of the labels of the vertices. The proof of the lemma is then
complete by showing that the measure of the $X$ process constructed in Lemma 1 is
involution invariant.

From the proof of Lemma 1 it is clear that the joint distribution of the $X$
process is determined by the property that for any $d \ge 1$

    $\{ X(v, w) \mid v$ at depth $d - 1$ from the root, $w$ at depth $d$ from the root $\}$

are independent random variables with distribution $F_*$. We need to show
that this property is invariant under the involution map.

If $\phi$ is the root (first distinguished vertex) of $\mathcal{T}$, and $u \sim \phi$ is the second
distinguished vertex then, under the involution map, $u$ becomes the root
and $\phi$ the second distinguished vertex. Write $\mathcal{T}_u$ for the subtree containing
$u$ obtained by removing the edge $\{\phi, u\}$. For an arbitrary Borel set $B$, define
the event

    $A := \{ (X(v, w), v$ at depth $d - 1$ from $u$, $w$ at depth $d$ from $u) \in B \}$.

The inverse image of $A$ under the involution map $\iota$ is

    $\iota^{-1}(A) = \{ (X(v_1, w_1), v_1 \in \mathcal{T}_u, v_1$ at depth $d$ from $\phi$,
                      $w_1$ at depth $d + 1$ from $\phi$;
                      $X(v_2, w_2), v_2 \in \mathcal{T} \setminus \mathcal{T}_u, v_2$ at depth $d - 2$ from $\phi$,
                      $w_2$ at depth $d - 1$ from $\phi) \in B \}$.

Fig 4. The edges involved in events $A$ (a) and $\iota^{-1}(A)$ (b) are shown with arrow heads.
Here $d = 3$. The vertex with a filled circle is the root, and the vertex with an unfilled circle
is the second distinguished vertex.

Figure 4 shows the edges involved. It is clear that the random variables
considered above are independent with distribution $F_*$. Consequently the
measure of the set $\iota^{-1}(A)$ equals the measure of $A$. This completes the proof.
Note that we have used here the simpler notion of involution invariance
described in Section 5 rather than spatial invariance as used in [4]. □

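6.3. Evaluating the cost. In the following theorem we evaluate the cost
of the edge-cover $C_{\mathrm{opt}}$ on $\mathcal{T}$. The expectation is twice the right-hand
side of (6), since each edge of the cover is counted once from each of its
endpoints.

Theorem 7.

    $E[ \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi(\phi, v) ] = 2W(1) + W(1)^2$.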
Proof. Denote by $D$ the event that $\xi(\phi, v) > X(\phi, v)$ for all $v \sim \phi$.
Under the event $D$ there is only one vertex in $C_{\mathrm{opt}}(\phi)$, say $y$. By Lemma 2,
$y$ is the only neighbour of $\phi$ satisfying $\xi(\phi, y) \le X(\phi, y) + X(y, \phi)$. Also, from
(16), $X(y, \phi) > 0$. Conversely, if there is a neighbour $y$ of $\phi$ that satisfies

    (i) $X(y, \phi) > 0$, (ii) $\xi(\phi, y) > X(\phi, y)$, and (iii) $\xi(\phi, y) \le X(\phi, y) + X(y, \phi)$

then, from (16), we have

    $0 < X(y, \phi) = \min \{ (\xi(\phi, v) - X(\phi, v))^+, v \sim \phi, v \ne y \}$,

which implies $\xi(\phi, v) > X(\phi, v)$ for every $v \sim \phi$, $v \ne y$. This and (ii) together
imply that the event $D$ holds, and $C_{\mathrm{opt}}(\phi) = \{y\}$.

Now fix a $z > 0$, and condition on the event that there is a neighbour $v_z$
of $\phi$ with $\xi(\phi, v_z) = z$. Call this event $E_z$. If we condition a Poisson process
to have a point at some location, then the conditional process on removing
this point is again a Poisson process with the same intensity. This shows
that under $E_z$, $X(\phi, v_z)$ and $X(v_z, \phi)$ both have the same distribution $F_*$.
Also they are independent. Using these facts and the characterization of
the event $D$ in the previous paragraph, the expected cost under $D$ can be
written as

(19)    $E[ \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi(\phi, v) 1_D ]$
        $= \int_{z=0}^{\infty} z P\{ X(v_z, \phi) > 0, z > X(\phi, v_z), z \le X(\phi, v_z) + X(v_z, \phi) \} dz$
        $= \int_{z=0}^{\infty} ( z P\{ X(v_z, \phi) > z \} P\{ X(\phi, v_z) = 0 \}
              + \int_{x=0}^{z} z P\{ X(v_z, \phi) > z - x \} dF_*(x) ) dz$
        $= W(1)(1 - W(1)) + 2W(1)^2$
        $= W(1) + W(1)^2$.

In the second equality above, we condition on $X(\phi, v_z) = 0$ and $X(\phi, v_z) =
x \in (0, z)$ respectively in the two terms of the integrand.

Under the event $D^c$, $C_{\mathrm{opt}}(\phi)$ contains all $v$ for which $\xi(\phi, v) \le X(\phi, v)$.






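The expected cost over this event is given by

(20)    $E[ \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi(\phi, v) 1_{D^c} ]$
        $= E[ \sum_{v \sim \phi} \xi(\phi, v) 1\{ \xi(\phi, v) \le X(\phi, v) \} ]$
        $= \sum_{v \sim \phi} E[ \xi(\phi, v) 1\{ \xi(\phi, v) \le X(\phi, v) \} ]$
        $= \int_{y=0}^{\infty} \sum_{v \sim \phi} P\{ \xi(\phi, v) > y, \xi(\phi, v) \le X(\phi, v) \} dy$
        $= \int_{y=0}^{\infty} \sum_{v \sim \phi} P\{ y < \xi(\phi, v) \le X \} dy$
              ($X$ is an $F_*$-distributed r.v. independent of the Poisson process)
        $= \int_{y=0}^{\infty} E[ \text{Number of Poisson points in } [y, X] ] dy$
        $= \int_{y=0}^{\infty} E[ (X - y)^+ ] dy$
        $= \int_{y=0}^{\infty} \int_{x=y}^{\infty} F_*(x) dx dy$
        $= \int_{y=0}^{\infty} \int_{x=y}^{\infty} W(1) e^{-x} dx dy$
        $= \int_{y=0}^{\infty} W(1) e^{-y} dy$
        $= W(1)$.

Combining (19) and (20) completes the proof.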

In passing, we remark that $C_{\mathrm{opt}}(\phi)$ is finite almost surely. □
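The value $2W(1) + W(1)^2$ can be sanity-checked numerically. The Python sketch below is an illustration only: it assumes the fixed point has complementary cdf $W(1) e^{-y}$ on $y \ge 0$ with an atom of mass $1 - W(1)$ at zero, evaluates the three one-dimensional integrals that remain after the inner integrations, and truncates the range at a hypothetical cutoff `UPPER`. The helper names are not from the paper.

```python
import math


def lambert_w1(iterations=50):
    """Solve w * exp(w) = 1 by Newton's method; the root is W(1)."""
    w = 0.5
    for _ in range(iterations):
        w -= (w * math.exp(w) - 1.0) / (math.exp(w) * (1.0 + w))
    return w


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


W = lambert_w1()
UPPER = 60.0  # assumed truncation point; the integrands decay like exp(-z)

# Atom term: X(phi, v_z) = 0 with probability 1 - W(1), and
# P{X(v_z, phi) > z} = W(1) exp(-z).
term_atom = simpson(lambda z: z * W * math.exp(-z) * (1.0 - W), 0.0, UPPER)

# Density term: the inner integral over x in (0, z) of
# W(1) exp(-(z - x)) * W(1) exp(-x) equals W(1)^2 * z * exp(-z).
term_density = simpson(lambda z: z * (W * W * z * math.exp(-z)), 0.0, UPPER)

# Complement of D: integral over y of E[(X - y)^+] = W(1) exp(-y).
term_complement = simpson(lambda y: W * math.exp(-y), 0.0, UPPER)

total = term_atom + term_density + term_complement
```

The three terms are approximately $W(1)(1 - W(1))$, $2W(1)^2$ and $W(1)$, and their sum agrees with $2W(1) + W(1)^2$ up to quadrature error.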



6.4. Optimality in the class of involution invariant edge-covers. We now
show that our candidate edge-cover $C_{\mathrm{opt}}$ has the minimum expected cost
among involution invariant edge-covers on the PWIT.

Theorem 8. Let $C$ be an involution invariant edge-cover of the PWIT
$\mathcal{T}$. Write $C(\phi)$ for the set of vertices of $\mathcal{T}$ adjacent to the root $\phi$ in $C$. Then

    $E[ \sum_{v \in C(\phi)} \xi(\phi, v) ] \ge E[ \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi(\phi, v) ]$.

Let us first set up some notation that will simplify the proof steps. For
each directed edge $(v, w)$ of $\mathcal{T}$, define a random variable

(21)    $Y(v, w) = \min \{ \sum_{y \in A} (\xi(w, y) - X(w, y)) : A \subset N_w \setminus \{v\}, A \text{ nonempty} \}$,

where $N_w$ is the set of neighbours of $w$. It is easy to see that the random
variable can be written as

    $Y(v, w) = \begin{cases}
        \min_{y \sim w, y \ne v} \{ \xi(w, y) - X(w, y) \}
            & \text{if } \xi(w, y) - X(w, y) > 0 \text{ for all } y \sim w, y \ne v, \\
        \sum_{y \sim w, y \ne v} (\xi(w, y) - X(w, y)) 1\{ \xi(w, y) - X(w, y) \le 0 \}
            & \text{otherwise.}
    \end{cases}$

Note that $(Y(v, w))^+ = X(v, w)$.
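The two descriptions of $Y(v, w)$ and the identity $(Y(v, w))^+ = X(v, w)$ can be illustrated on a finite list of differences. This Python sketch is a hedged illustration, not the authors' construction: `diffs` is an assumed finite list standing for the values $\xi(w, y) - X(w, y)$ over $y \sim w$, $y \ne v$, and the function names are hypothetical.

```python
import math
from itertools import combinations


def y_subset_min(diffs):
    """Definition (21): minimum subset sum over nonempty subsets."""
    return min(
        sum(subset)
        for size in range(1, len(diffs) + 1)
        for subset in combinations(diffs, size)
    )


def y_case_form(diffs):
    """Case form: the single minimum if all differences are positive,
    otherwise the sum of the nonpositive differences."""
    if all(d > 0 for d in diffs):
        return min(diffs)
    return sum(d for d in diffs if d <= 0)


def message(diffs):
    """Right-hand side of (16): minimum of the positive parts."""
    return min(max(d, 0.0) for d in diffs)


all_positive = [0.3, 1.2, 0.8]
mixed = [-0.25, 0.5, -0.5]
```

For `all_positive` both forms give 0.3, which is also the message value; for `mixed` both give -0.75, whose positive part is the message value 0.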

Suppose that $E[ \sum_{v \in C(\phi)} \xi(\phi, v) ] < \infty$. Then $C(\phi)$ is a finite set with
probability 1 because $\{ \xi(\phi, v), v \sim \phi \}$ are points of a Poisson process of
rate 1. For such an edge-cover $C$, define

(22)    $A(C) = \sum_{v \in C(\phi)} X(\phi, v) + \max_{v \notin C(\phi), v \sim \phi} Y(v, \phi)$.

The max operation in the above equation is over an infinite number of
vertices; however, in the remark after the proof of Lemma 4, we will show
that effectively $Y(v, \phi)$ assumes only finitely many values as we vary $v$, and
hence the max operation as well as $A(C)$ are almost surely well defined.
The following two lemmas will be used to prove Theorem 8.

Lemma 4. Let $C$ be an edge-cover rule on the PWIT such that

    $E[ \sum_{v \in C(\phi)} \xi(\phi, v) ] < \infty$.

Then almost surely,

    $\sum_{v \in C(\phi)} \xi(\phi, v) \ge A(C)$.

Furthermore,

    $\sum_{v \in C_{\mathrm{opt}}(\phi)} \xi(\phi, v) = A(C_{\mathrm{opt}})$.

Lemma 5. Let $C$ be an edge-cover rule on the PWIT such that

    $E[ \sum_{v \in C(\phi)} \xi(\phi, v) ] < \infty$.

If $C$ is involution invariant, we have $E[A(C)] \ge E[A(C_{\mathrm{opt}})]$.

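Proof of Theorem 8. If $E[ \sum_{v \in C(\phi)} \xi(\phi, v) ] = \infty$, the statement of the
theorem is trivially true. Assume that it is finite. We are now in a position
to apply Lemmas 4 and 5 as follows to get the result:

    $E[ \sum_{v \in C(\phi)} \xi(\phi, v) ] \ge E[ A(C) ]$                             (Lemma 4)
                                            $\ge E[ A(C_{\mathrm{opt}}) ]$                (Lemma 5)
                                            $= E[ \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi(\phi, v) ]$    (Lemma 4).  □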



Let us now complete the proofs of Lemmas 4 and 5.
Proof of Lemma 4. From (21), we have

    $Y(v, \phi) \le \sum_{y \in A} (\xi(\phi, y) - X(\phi, y))$

for all $A \subset N_\phi \setminus \{v\}$, $A$ nonempty.

For any $v \notin C(\phi)$, we can choose $A = C(\phi)$ to obtain

    $Y(v, \phi) \le \sum_{y \in C(\phi)} (\xi(\phi, y) - X(\phi, y))$.

This implies

(23)    $\max_{v \notin C(\phi), v \sim \phi} Y(v, \phi) \le \sum_{y \in C(\phi)} (\xi(\phi, y) - X(\phi, y))$.

Thanks to the finite expectation assumption in the lemma, $C(\phi)$ is a finite
set almost surely, and so $\sum_{y \in C(\phi)} X(\phi, y)$ is finite. Rearrangement of (23)
then yields

    $\sum_{v \in C(\phi)} \xi(\phi, v) \ge A(C)$.

Now recall the alternate characterization of $C_{\mathrm{opt}}$ via

(24)    $C_{\mathrm{opt}}(w) = \arg\min \{ \sum_{y \in A} (\xi(w, y) - X(w, y)) : A \subset N_w, A \text{ nonempty} \}$.

From (21) and (24), for any $v \notin C_{\mathrm{opt}}(\phi)$, we have

(25)    $Y(v, \phi) = \sum_{y \in C_{\mathrm{opt}}(\phi)} (\xi(\phi, y) - X(\phi, y))$,

and hence

    $\max_{v \notin C_{\mathrm{opt}}(\phi), v \sim \phi} Y(v, \phi) = \sum_{y \in C_{\mathrm{opt}}(\phi)} (\xi(\phi, y) - X(\phi, y))$.

It follows by rearrangement that

    $\sum_{v \in C_{\mathrm{opt}}(\phi)} \xi(\phi, v) = A(C_{\mathrm{opt}})$.  □

Let us quickly reassure the reader that the max operation in (22) is well
defined. Notice that (25) implies that $Y(w, \phi)$ takes values in the finite set

    $\{ \sum_{y \in C_{\mathrm{opt}}(\phi)} (\xi(\phi, y) - X(\phi, y)) \} \cup \{ Y(v, \phi) \mid v \in C_{\mathrm{opt}}(\phi) \}$.

That $C_{\mathrm{opt}}(\phi)$ is finite (almost surely) can be gleaned from Theorem 7. This
validates the assertion that the max in the definition of $A(C)$ is well defined.

Proof of Lemma 5. Define

(26)    $\tilde{A}(C) = \sum_{v \in C(\phi)} X(v, \phi) + \max_{v \notin C(\phi), v \sim \phi} Y(v, \phi)$.

We will prove Lemma 5 by showing the following two results:

(a) For an involution invariant edge-cover $C$,

(27)    $E[ A(C) ] = E[ \tilde{A}(C) ]$.

(b) Almost surely,

(28)    $\tilde{A}(C) \ge \tilde{A}(C_{\mathrm{opt}})$.

We first prove (27). First, by involution invariance of $C$, we have

(29)    $E[ \sum_{v \in C(\phi)} X(\phi, v) ] = E[ \sum_{v \in C(\phi)} X(v, \phi) ]$.

Indeed, the left hand side equals

    $\int \sum_{v \in C(\phi)} X(\phi, v) \, d\mu_C([G, \phi])$,

where $\mu_C$ is the probability measure on $\mathcal{G}_*$ corresponding to $C$. By involution
invariance, this equals

    $\int \sum_{v \in C(\phi)} X(v, \phi) \, d\mu_C([G, \phi])$,

which is equal to the right hand side of (29). Thanks to the finite expectation
assumption of the lemma, we saw in the proof of Lemma 4 that
$\max_{v \notin C(\phi), v \sim \phi} Y(v, \phi)$ is finite almost surely. Now observe that $A(C)$
(respectively $\tilde{A}(C)$) is obtained by adding the almost surely finite random variable
$\max_{v \notin C(\phi), v \sim \phi} Y(v, \phi)$ to the random variable which is the argument of the
expectation in the left side of (29) (respectively the right side of (29)). Taking
expectation and using the equality in (29), we get (27).

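Now we will prove (28). First condition on the event $L_1 = \{ |C_{\mathrm{opt}}(\phi)| > 1 \}$.
Observe that, under $L_1$, $\xi(\phi, y) - X(\phi, y) \le 0$, $y \sim \phi$, if and only if $y \in
C_{\mathrm{opt}}(\phi)$, and there are at least two such $y$. Then, by (16),

(30)    $X(v, \phi) = 0$ for all $v \sim \phi$.

Also, from (21) and (24),

    $Y(v, \phi) \ge \sum_{y \in C_{\mathrm{opt}}(\phi)} (\xi(\phi, y) - X(\phi, y)) = Y(w, \phi)$

if $w \notin C_{\mathrm{opt}}(\phi)$. This implies

    $Y(v, \phi) \ge \max_{w \notin C_{\mathrm{opt}}(\phi), w \sim \phi} Y(w, \phi)$ for all $v \sim \phi$.

In particular,

(31)    $\max_{v \notin C(\phi), v \sim \phi} Y(v, \phi) \ge \max_{w \notin C_{\mathrm{opt}}(\phi), w \sim \phi} Y(w, \phi)$.

Combining (30) and (31) gives

(32)    $\sum_{v \in C(\phi)} X(v, \phi) + \max_{v \notin C(\phi), v \sim \phi} Y(v, \phi)
            \ge \sum_{v \in C_{\mathrm{opt}}(\phi)} X(v, \phi) + \max_{v \notin C_{\mathrm{opt}}(\phi), v \sim \phi} Y(v, \phi)$.

Thus $\tilde{A}(C) \ge \tilde{A}(C_{\mathrm{opt}})$ under $L_1$.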

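Now consider the event $L_2 = \{ |C_{\mathrm{opt}}(\phi)| = 1 \}$. Let

    $X^{(1)} = \min_{v \sim \phi} (\xi(\phi, v) - X(\phi, v))$, and

    $X^{(2)} = \min^{(2)}_{v \sim \phi} (\xi(\phi, v) - X(\phi, v))$,

where $\min^{(2)}$ stands for the second minimum.

Let $C_{\mathrm{opt}}(\phi) = \{u\}$. Then $X(u, \phi) = X^{(2)}$ and for $v \in C(\phi) \setminus C_{\mathrm{opt}}(\phi)$,
$X(v, \phi) = (X^{(1)})^+$. So we get

(33)    $\sum_{v \in C(\phi)} X(v, \phi) - \sum_{v \in C_{\mathrm{opt}}(\phi)} X(v, \phi)
            = \sum_{v \in C(\phi) \setminus C_{\mathrm{opt}}(\phi)} (X^{(1)})^+ - X^{(2)} 1\{ u \notin C(\phi) \}$.

If $v \notin C_{\mathrm{opt}}(\phi)$, then $Y(v, \phi) = X^{(1)}$. Also $Y(u, \phi) = X^{(2)}$. Since $X^{(2)} >
X^{(1)}$, we get

    $\max_{v \notin C(\phi), v \sim \phi} Y(v, \phi) = X^{(2)} 1\{ u \notin C(\phi) \} + X^{(1)} 1\{ u \in C(\phi) \}$

and

    $\max_{v \notin C_{\mathrm{opt}}(\phi), v \sim \phi} Y(v, \phi) = X^{(1)}$.

Therefore,

(34)    $\max_{v \notin C(\phi), v \sim \phi} Y(v, \phi) - \max_{v \notin C_{\mathrm{opt}}(\phi), v \sim \phi} Y(v, \phi)
            = (X^{(2)} - X^{(1)}) 1\{ u \notin C(\phi) \}$.

Adding (33) and (34), and canceling $X^{(2)} 1\{ u \notin C(\phi) \}$, we get

    $[ \sum_{v \in C(\phi)} X(v, \phi) + \max_{v \notin C(\phi), v \sim \phi} Y(v, \phi) ]
       - [ \sum_{v \in C_{\mathrm{opt}}(\phi)} X(v, \phi) + \max_{v \notin C_{\mathrm{opt}}(\phi), v \sim \phi} Y(v, \phi) ]$
    $= \sum_{v \in C(\phi) \setminus C_{\mathrm{opt}}(\phi)} (X^{(1)})^+ - X^{(1)} 1\{ u \notin C(\phi) \}$
    $\ge 0$,

where the last inequality follows because there exists a $v \in C(\phi) \setminus C_{\mathrm{opt}}(\phi)$ by
virtue of our assumption that $C(\phi) \ne C_{\mathrm{opt}}(\phi)$. Thus $\tilde{A}(C) \ge \tilde{A}(C_{\mathrm{opt}})$ under
$L_2$ as well. □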

7. Completing the lower bound. In the previous section we described
an edge-cover $C_{\mathrm{opt}}$ on the infinite tree $\mathcal{T}$. We showed that this edge-cover
satisfies the expected property of involution invariance, and that it has the
minimum expected cost among all edge-covers having this property. We use
this to show now that the expected cost of $C_{\mathrm{opt}}$ serves as an asymptotic lower
bound on the expected cost of min-cost edge-covers on $\bar{K}_n$.

Theorem 9. Let $C_n^*$ be the optimal edge-cover on $\bar{K}_n$. Then

    $\liminf_{n \to \infty} E[ \sum_{v : \{\phi, v\} \in C_n^*} \xi_{\bar{K}_n}(\phi, v) ] \ge 2W(1) + W(1)^2$.

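Proof. Take a subsequence $\{n_k, k \ge 1\}$ for which the lim inf above is a
limit. Now consider the joint sequence $(C^*_{n_k}, \bar{K}_{n_k})_{k \ge 1}$ in $\mathcal{G}_* \times \mathcal{G}_*$. Because
$\bar{K}_{n_k} \to \mathcal{T}$, for every $\epsilon > 0$ there is a compact subset $\mathcal{K}$ of $\mathcal{G}_*$ with
$P\{ \bar{K}_{n_k} \in \mathcal{K} \} > 1 - \epsilon$ for all $k$. Also, we can take the graphs $\bar{K}_{n_k}$ to be
on a common vertex set $V$, and assume that all graphs in $\mathcal{K}$ are defined
on the same vertex set. Let $\mathcal{E}$ denote the set of all possible edges. Let $\mathcal{K}_S$
denote the set $\{ H \text{ is a subgraph of } G \mid G \in \mathcal{K} \}$. Since $C^*_{n_k}$ is a subgraph of
$\bar{K}_{n_k}$, $P\{ C^*_{n_k} \in \mathcal{K}_S \} > 1 - \epsilon$ for all $k$. An element of $\mathcal{K}_S$ can be identified with
an element of $\mathcal{K} \times \{0, 1\}^{\mathcal{E}}$, where 1 or 0 denotes the presence or absence of an
edge respectively. Since the latter is a compact set, so is $\mathcal{K}_S$. This shows that
the sequence of random graphs $\{ C^*_{n_k} \}_{k \ge 1}$ is tight. By completeness of $\mathcal{G}_*$,
we have that $\{ (C^*_{n_k}, \bar{K}_{n_k}), k \ge 1 \}$ is sequentially compact. Therefore, there
exists a further subsequence $\{n_j, j \ge 1\}$ of $\{n_k, k \ge 1\}$ such that $(C^*_{n_j}, \bar{K}_{n_j})$
converges in the local weak sense to $(C^*, \mathcal{T})$. Since the distribution of each $C^*_{n_j}$ is
involution invariant, so is the distribution of $C^*$. By Skorohod's theorem we
can assume the convergence occurs almost surely in some probability space.
By the definition of local weak convergence,

    $\sum_{v : \{\phi, v\} \in C^*_{n_j}} \xi_{\bar{K}_{n_j}}(\phi, v) \to \sum_{v \in C^*(\phi)} \xi_{\mathcal{T}}(\phi, v)$ as $j \to \infty$ a.s.

By Fatou's lemma,

    $\liminf_{j \to \infty} E[ \sum_{v : \{\phi, v\} \in C^*_{n_j}} \xi_{\bar{K}_{n_j}}(\phi, v) ] \ge E[ \sum_{v \in C^*(\phi)} \xi_{\mathcal{T}}(\phi, v) ]$.

By Theorem 8 and Theorem 7,

    $E[ \sum_{v \in C^*(\phi)} \xi_{\mathcal{T}}(\phi, v) ] \ge E[ \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi_{\mathcal{T}}(\phi, v) ] = 2W(1) + W(1)^2$.

This completes the proof. □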



8. Belief propagation. To prove the upper bound on $E C_n$ and complete
the proof of Theorem 1, we will construct edge-covers on $K_n$, $n \ge 1$, with
costs $W(1) + W(1)^2/2 + o(1)$. This is achieved using belief propagation as
described in Section 2.

We follow the approach of [14] to prove Theorem 2. In this section we will
show the convergence of the BP algorithm on the PWIT $\mathcal{T}$, and relate the
converged solution with the edge-cover $C_{\mathrm{opt}}$ of Section 6. In the next section
we show that belief propagation on $K_n$ converges to belief propagation
on $\mathcal{T}$ as $n \to \infty$.

8.1. Convergence of BP on the PWIT. In this section we will prove that
the messages on $\mathcal{T}$ converge, and relate the resulting edge-cover with the
cover $C_{\mathrm{opt}}$ of Section 6.

The message process can essentially be written as

(35)    $X^{k+1}_{\mathcal{T}}(\dot{v}, v) = \min_{i \ge 1} \{ (\xi_{\mathcal{T}}(v, v.i) - X^{k}_{\mathcal{T}}(v, v.i))^+ \}$,

where the initial messages $X^{0}_{\mathcal{T}}(\dot{v}, v)$ are i.i.d. random variables (zero in the
case of our algorithm; see (2)).

By the structure of $\mathcal{T}$ it is clear that for a fixed $k \ge 0$, all the messages
$X^{k}_{\mathcal{T}}(\dot{v}, v)$, $v \in V$, share the same distribution. Also, it can be seen from the
analysis of RDE (13) in Section 4 that if we denote the complementary
cdf of this distribution at some step $k$ by $F$, then after one update the
complementary cdf is given by the map

    $TF(y) = \begin{cases}
        e^{-y} \exp( -\int_0^{\infty} F(t) dt ) & \text{if } y \ge 0, \\
        1 & \text{if } y < 0.
    \end{cases}$

The operator $T$ thus defined on the space $\mathcal{D}$ of complementary cdfs of
$R_+$-valued random variables has a unique fixed point given by (15).

The following theorem shows that the fixed point $F_*$ has the full space
$\mathcal{D}$ as its domain of attraction. In other words, irrespective of the initial
distribution, the common distribution of the messages $X^{k}_{\mathcal{T}}(\dot{v}, v)$, $v \in V$,
converges to the distribution $F_*$ as $k \to \infty$.

Theorem 10. For any $F \in \mathcal{D}$,

    $\lim_{k \to \infty} T^k F = F_*$.

Proof. For any $y \ge 0$ and $k \ge 0$,

    $T^{k+1} F(y) = e^{-y} \exp( -\int_0^{\infty} T^k F(t) dt )$.

Thus for $k \ge 1$, $T^k F(y) = c_k e^{-y}$, where $c_k$, $k \ge 1$, are nonnegative real
numbers satisfying

    $c_{k+1} = \exp( -\int_0^{\infty} c_k e^{-t} dt ) = e^{-c_k}$.

It is easy to check that $c_k \to W(1)$. Consequently, $T^k F \to F_*$. □
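The convergence of the constants can be observed numerically. The following Python sketch is an illustration under the recursion $c_{k+1} = e^{-c_k}$ obtained by evaluating the integral in the proof; the starting values are arbitrary assumptions (for a general initial $F$, the first constant is $\exp(-\int_0^\infty F(t) dt)$), and the function name is hypothetical.

```python
import math


def iterate_constant(c_start, steps):
    """Apply c -> exp(-c) repeatedly, mirroring c_{k+1} = exp(-c_k)."""
    c = c_start
    for _ in range(steps):
        c = math.exp(-c)
    return c


# Different nonnegative starting constants approach the same limit,
# which solves c = exp(-c), i.e. c * exp(c) = 1, so c = W(1).
limits = [iterate_constant(c_start, 200) for c_start in (0.0, 0.3, 5.0)]
```

The map $c \mapsto e^{-c}$ has derivative of magnitude $W(1) < 1$ at the fixed point, which is why the iterates settle quickly.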



8.2. Endogeny and bivariate uniqueness. We have established the convergence
of the messages on $\mathcal{T}$ in distribution. We now ask for the joint
convergence of the message process on the tree. In particular, the question
is whether there is a limit process satisfying the requirements of Lemma 1.

An important property of the limiting process that allows us to come to
this conclusion is endogeny, introduced in [5]. Endogeny is the property of a
recursive tree process (RTP) that it is measurable with respect to the i.i.d.
process $(\xi_i, N_i)$, $i \in V$.

Definition. An invariant RTP with marginal distribution $\mu$ is said to
be endogenous if the root variable $X_\phi$ is almost surely measurable with respect
to the $\sigma$-algebra

    $\sigma( \{ (\xi_i, N_i) \mid i \in V \} )$.

Endogeny is related to another property of the RTP termed bivariate
uniqueness, also introduced in [5].

For a general RDE (14) write $T : \mathcal{P} \to \mathcal{P}(S)$ for the map induced by the
function $g$. Let $\mathcal{P}^{(2)}$ denote the space of probability measures on $S \times S$ with
marginals in $\mathcal{P}$. We now define a bivariate map $T^{(2)} : \mathcal{P}^{(2)} \to \mathcal{P}(S \times S)$,
which maps a distribution $\mu^{(2)} \in \mathcal{P}^{(2)}$ to the joint distribution of

    $( g(\xi; (X^{(1)}_j, 1 \le j \le N)), \; g(\xi; (X^{(2)}_j, 1 \le j \le N)) )$,

where $(X^{(1)}_j, X^{(2)}_j)_{j \ge 1}$ are independent with joint distribution $\mu^{(2)}$ on $S \times S$,
and the family of random variables $(X^{(1)}_j, X^{(2)}_j)_{j \ge 1}$ is independent of the
pair $(\xi, N)$.

It is easy to see that if $\mu$ is a fixed point of the RDE then the associated
diagonal measure $\mu^{\nearrow} := \mathrm{Law}(X, X)$, where $X \sim \mu$, is a fixed point of the
operator $T^{(2)}$.

Definition. An invariant RTP with marginal distribution $\mu$ is said to
have the bivariate uniqueness property if $\mu^{\nearrow}$ is the unique fixed point of the
operator $T^{(2)}$ with marginals $\mu$.

Theorem 11 of [5], stated below, shows that under certain assumptions
endogeny and bivariate uniqueness are equivalent.

Theorem 11 (Theorem 11 of [5]). Let $S$ be a Polish space. Consider an
invariant RTP with marginal distribution $\mu$.

(a) If the endogenous property holds, then the bivariate uniqueness property
holds.

(b) Conversely, suppose the bivariate uniqueness property holds. If also
$T^{(2)}$ is continuous with respect to weak convergence on the set of
bivariate distributions with marginals $\mu$, then the endogenous property
holds.

(c) The endogenous property holds if and only if $(T^{(2)})^n (\mu \otimes \mu) \xrightarrow{d} \mu^{\nearrow}$,
where $\mu \otimes \mu$ is the product measure.

The following theorem establishes the endogeny of the edge-cover RDE.

Theorem 12. The invariant RTP with marginal $\mu_*$ (with cdf $F_*$) associated
with the edge-cover RDE (13) is endogenous.

Proof. By Theorem 11(b) it is sufficient to prove bivariate uniqueness
and continuity for the map $T^{(2)} : \mathcal{P}(R_+ \times R_+) \to \mathcal{P}(R_+ \times R_+)$, where
$R_+ = [0, \infty)$ and $T^{(2)}(\mu^{(2)})$ is the distribution of

    $( X, Y ) = ( \min_{i \ge 1} (\xi_i - X_i)^+, \; \min_{i \ge 1} (\xi_i - Y_i)^+ )$,

where $(X_i, Y_i)_{i \ge 1}$ are independent with joint distribution $\mu^{(2)}$ on $R_+^2$, and
are independent of $(\xi_i)_{i \ge 1}$, which are points of a Poisson process of rate 1 on
$R_+$.

To prove bivariate uniqueness, we have to show that if $\mu_*^{(2)}$ is a fixed point
of the above map (with marginals $\mu_*$) then $X = Y$ a.s. ($\mu_*^{(2)}$). By Lemma 1
of [6] this is equivalent to showing $X \stackrel{d}{=} Y \stackrel{d}{=} X \wedge Y$. Let $(X_i, Y_i)_{i \ge 1}$ be i.i.d.
with distribution $\mu_*^{(2)}$. The set of points $\mathcal{P} := \{ (\xi_i; (X_i, Y_i)) \mid i \ge 1 \}$ forms a
Poisson process on $(0, \infty) \times R_+^2$ with intensity $dt \, \mu_*^{(2)}(d(x, y))$ at $(t; (x, y))$.
Writing $G(x, y) = P\{ X > x, Y > y \}$ for $x, y \in R_+$, we get

(36)    $G(x, y) = P\{ \xi_i - X_i > x, \xi_i - Y_i > y, \text{ for all } i \ge 1 \}$
        $= P\{ \text{No point of } \mathcal{P} \text{ in } \{ (t; (u, v)) : t - u \le x \text{ or } t - v \le y \} \}$
        $= \exp( -\int_{t=0}^{\infty} P\{ t - X_i \le x \text{ or } t - Y_i \le y \} dt )$
        $= \exp( -\int_{t=0}^{x \vee y} dt - \int_{t = x \vee y}^{\infty} P\{ t - X_i \le x \text{ or } t - Y_i \le y \} dt )$
        $= e^{-(x \vee y)} \exp( -\int_{t = x \vee y}^{\infty} P\{ X_i \ge t - x \text{ or } Y_i \ge t - y \} dt )$
        $= e^{-(x \vee y)} \exp( -\int_{t = x \vee y}^{\infty} [ W(1) e^{-(t - x)} + W(1) e^{-(t - y)}
              - P\{ X_i \ge t - x, Y_i \ge t - y \} ] dt )$
        $= e^{-(x \vee y)} \exp( -W(1) e^{-(x \vee y)} (e^x + e^y) )
              \exp( \int_{t = x \vee y}^{\infty} P\{ X_i \ge t - x, Y_i \ge t - y \} dt )$.

From this, setting $x = y$, it is clear that $G(x, x) = c e^{-x}$, $x \ge 0$, for some
constant $c$. We now have to evaluate the constant.

Observe that the only place where $G(x, x)$ can be discontinuous (if at
all) is at $x = 0$. As a consequence, with $x = y$ and the change of variable
$z = t - x$, we see that the integral inside the exponent in (36) is
$\int_0^{\infty} P\{ X_i \ge z, Y_i \ge z \} dz = \int_0^{\infty} P\{ X_i > z, Y_i > z \} dz = \int_0^{\infty} G(z, z) dz$.
With $x = y$ in (36), and integrating, we find that

    $c = e^{-2W(1)} e^{c}$,

i.e.,

    $c e^{-c} = e^{-2W(1)}$.

Since $W(1) = e^{-W(1)}$, it can be seen that $c = W(1)$ solves the above
equation. Because $G(0, 0) \le 1$, we have $c \le 1$, and noting that the function
$x \mapsto x e^{-x}$ is monotone increasing for $0 \le x \le 1$, we conclude that $c = W(1)$
is the only solution. Thus $G(x, x) = F_*(x)$, i.e., $X \wedge Y \stackrel{d}{=} X \stackrel{d}{=} Y$. This establishes
bivariate uniqueness.
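The uniqueness of the constant can be confirmed numerically. The Python sketch below is only a check, with hypothetical helper names: it solves $c e^{-c} = e^{-2W(1)}$ on $[0, 1]$ by bisection, relying on the monotonicity of $x \mapsto x e^{-x}$ on that interval, and compares the root with $W(1)$ obtained from the fixed-point iteration $c \mapsto e^{-c}$.

```python
import math


def lambert_w1(steps=200):
    """Approximate W(1) as the fixed point of c -> exp(-c)."""
    c = 0.5
    for _ in range(steps):
        c = math.exp(-c)
    return c


def bisect_increasing(f, target, lo, hi, iterations=200):
    """Solve f(x) = target for an increasing f on [lo, hi] by bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


W = lambert_w1()
target = math.exp(-2.0 * W)  # right-hand side e^{-2 W(1)}
c_root = bisect_increasing(lambda x: x * math.exp(-x), target, 0.0, 1.0)
```

The target value $e^{-2W(1)} = W(1)^2$ lies below $e^{-1}$, the value of $x e^{-x}$ at $x = 1$, so the bracket $[0, 1]$ contains exactly one root.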

Now to establish endogeny it remains to prove the continuity hypothesis
of Theorem 11(b). Note that we require continuity of the map $T^{(2)}$ only over
the subset $\mathcal{P}_*^{(2)} \subset \mathcal{P}(R_+^2)$ which contains probability distributions with both
marginals equal to $\mu_*$. We need to show that for any $\mu^{(2)} \in \mathcal{P}_*^{(2)}$ and a sequence
$\{ \mu_n^{(2)} \}_{n \ge 1}$ in $\mathcal{P}_*^{(2)}$ such that $\mu_n^{(2)} \xrightarrow{d} \mu^{(2)}$, we have $T^{(2)}(\mu_n^{(2)}) \xrightarrow{d} T^{(2)}(\mu^{(2)})$.

Take a probability space $(\Omega, \mathcal{F}, P)$ in which there are random vectors
$(X, Y) \sim \mu^{(2)}$ and a sequence of random vectors $\{ (X_n, Y_n), n \ge 1 \}$, with
$(X_n, Y_n) \sim \mu_n^{(2)}$. Then $(X_n, Y_n) \xrightarrow{d} (X, Y)$. By following the steps of (36),
for $x, y \in R_+$, we can write

(37)    $G_n(x, y) = T^{(2)}(\mu_n^{(2)})((x, \infty), (y, \infty))$
        $= e^{-(x \vee y)} \exp( -W(1) e^{-(x \vee y)} (e^x + e^y) )
              \exp( \int_{t = x \vee y}^{\infty} P\{ X_n > t - x, Y_n > t - y \} dt )$
        $= e^{-(x \vee y)} \exp( -W(1) e^{-(x \vee y)} (e^x + e^y) )
              \exp( \int_{t = x \vee y}^{\infty} P\{ (X_n + x) \wedge (Y_n + y) > t \} dt )$
        $= e^{-(x \vee y)} \exp( -W(1) e^{-(x \vee y)} (e^x + e^y) )
              \exp( E[ ((X_n + x) \wedge (Y_n + y) - x \vee y)^+ ] )$.

The same calculation also gives

(38)    $G(x, y) = T^{(2)}(\mu^{(2)})((x, \infty), (y, \infty))$
        $= e^{-(x \vee y)} \exp( -W(1) e^{-(x \vee y)} (e^x + e^y) )
              \exp( E[ ((X + x) \wedge (Y + y) - x \vee y)^+ ] )$.

Let

    $Z_n^{x,y} := ((X_n + x) \wedge (Y_n + y) - x \vee y)^+$ and
    $Z^{x,y} := ((X + x) \wedge (Y + y) - x \vee y)^+$.

Now $(X_n, Y_n) \xrightarrow{d} (X, Y)$ implies that, for each $(x, y)$, $Z_n^{x,y} \xrightarrow{d} Z^{x,y}$. Now

    $0 \le Z_n^{x,y} \le X_n$ for all $n \ge 1$.

Since $E X_n = E X$ for all $n \ge 1$, by the dominated convergence theorem, we
have $E Z_n^{x,y} \to E Z^{x,y}$ as $n \to \infty$. Consequently $G_n(x, y) \to G(x, y)$ for all
$x, y \in R_+$. □

8.3. Completing the proof of convergence of BP on the PWIT. With
endogeny in hand, we conclude that given a realization of $\mathcal{T}$, almost surely,
the resulting stationary configuration of the $X$ process of Lemma 1 is unique.
Also, the following lemma will show that if the initial messages are i.i.d.
random variables with the fixed point distribution $\mu_*$ then the message
process (35) converges, and the limit configuration is unique (almost surely).

Lemma 6. If the initial messages $X^{0}_{\mathcal{T}}(\dot{v}, v)$ are i.i.d. random variables
with distribution $\mu_*$ then the message process (35) converges in $L^2$ to the
process $X$ as $k \to \infty$.

Proof. Consider the evolution of bivariate messages according to (35),
starting from $(X^{0}_{\mathcal{T}}(\cdot), X(\cdot))$. The second component will remain unchanged
because the $X$ process satisfies (16). The distribution of $(X^{0}_{\mathcal{T}}(\cdot), X(\cdot))$ is
$\mu_* \otimes \mu_*$. We have

    $\mathrm{Law}(X^{k+1}_{\mathcal{T}}(\cdot), X(\cdot)) = T^{(2)}( \mathrm{Law}(X^{k}_{\mathcal{T}}(\cdot), X(\cdot)) )$.

Here $T^{(2)}$ is as defined in the proof of Theorem 12. By Theorem 11(c), $(X^{k}_{\mathcal{T}}(\cdot), X(\cdot))$
converges to $(X(\cdot), X(\cdot))$ in distribution as $k \to \infty$. Since $(X^{k}_{\mathcal{T}} - X)^2 \le
2(X^{k}_{\mathcal{T}})^2 + 2X^2$, and $E[ 2(X^{k}_{\mathcal{T}})^2 + 2X^2 ] = 4 E[X^2]$, the dominated convergence
theorem gives $E[ (X^{k}_{\mathcal{T}} - X)^2 ] \to 0$ as $k \to \infty$. □

We now prove that if the initial values are i.i.d. random variables with
some arbitrary distribution (not necessarily $\mu_*$), then the message process
(35) does indeed converge to the unique stationary configuration. Of course,
the initial condition of particular interest to us is the all zero initial condition
(2), but we will prove a more general result.

The following lemma will allow us to interchange limit and minimization
while working with the updates on $\mathcal{T}$.

Lemma 7. Let $X^{0}_{\mathcal{T}}(\dot{v}, v)$ be initialized to i.i.d. random variables with
arbitrary distribution $F$ on $R_+$. Then the map

    $\pi^{k}_{\mathcal{T}}(v) = \arg\min_{i \ge 1} \{ (\xi_{\mathcal{T}}(v, v.i) - X^{k}_{\mathcal{T}}(v, v.i))^+ \}$

is a.s. well defined and finite for all $k \ge 1$, and

    $\sup_{k \ge 1} P\{ \max \arg\min_{i \ge 1} \{ (\xi_{\mathcal{T}}(v, v.i) - X^{k}_{\mathcal{T}}(v, v.i))^+ \} \ge i_0 \} \to 0$ as $i_0 \to \infty$.

Proof. Fix $k$. If $j \in \arg\min_{i \ge 1} \{ (\xi_{\mathcal{T}}(v, v.i) - X^{k}_{\mathcal{T}}(v, v.i))^+ \}$ and $j \ge 2$,
then

    $\xi_{\mathcal{T}}(v, v.j) - X^{k}_{\mathcal{T}}(v, v.j) \le (\xi_{\mathcal{T}}(v, v.1) - X^{k}_{\mathcal{T}}(v, v.1))^+$.

Now

(39)    $P\{ \xi_{\mathcal{T}}(v, v.j) - X^{k}_{\mathcal{T}}(v, v.j) \le (\xi_{\mathcal{T}}(v, v.1) - X^{k}_{\mathcal{T}}(v, v.1))^+ \}$
        $\le P\{ \xi_{\mathcal{T}}(v, v.j) \le X^{k}_{\mathcal{T}}(v, v.j) \}$
          $+ P\{ \xi_{\mathcal{T}}(v, v.j) - X^{k}_{\mathcal{T}}(v, v.j) \le \xi_{\mathcal{T}}(v, v.1) - X^{k}_{\mathcal{T}}(v, v.1) \}$.

The updates are such that $\{ X^{k}_{\mathcal{T}}(v, v.i), i \ge 1 \}$ remain i.i.d. and independent
of the Poisson process $\{ \xi_{\mathcal{T}}(v, v.i), i \ge 1 \}$. Thus, the probability on the right hand
side of (39) equals

    $P\{ \xi_j \le X^{k}_1 \} + P\{ \xi_j - \xi_1 \le X^{k}_2 - X^{k}_1 \}$,

where $\{ \xi_j \}$ is a Poisson process and $X^{k}_1, X^{k}_2$ are independent random variables
with the same distribution as $X^{k}_{\mathcal{T}}(v, v.1)$. Then

(40)    $\sum_{j=2}^{\infty} P\{ j \in \arg\min_{i \ge 1} \{ (\xi_{\mathcal{T}}(v, v.i) - X^{k}_{\mathcal{T}}(v, v.i))^+ \} \}$
        $\le \sum_{j=1}^{\infty} P\{ \xi_j \le X^{k}_1 \} + \sum_{j=1}^{\infty} P\{ \xi_j \le X^{k}_2 - X^{k}_1 \}$
        $\le E X^{k}_1 + E| X^{k}_1 - X^{k}_2 |$
        $\le 3 E X^{k}_1$.

From the proof of Theorem 10 it follows that $E X^{k}_1$ converges, and hence it
is bounded. This proves that the argmin is a.s. finite and the probability
in the statement of the lemma, being upper bounded by the tail sum of the
left-hand side of (40), converges uniformly to 0. □

We are now in a position to prove the required convergence.

Theorem 13. The recursive tree process defined by (35) with i.i.d. initial
messages converges to the unique stationary configuration in the following
sense. For every $v \in V$,

    $X^{k}_{\mathcal{T}}(v, v.i) \xrightarrow{L^2} X(v, v.i)$ as $k \to \infty$.

Also, the decisions at the root converge, i.e., $P\{ \pi^{k}_{\mathcal{T}}(\phi) \ne C_{\mathrm{opt}}(\phi) \} \to 0$ as
$k \to \infty$.

Proof. The proof is essentially identical to the proof of Theorem 5.2 of
[14]. We present it here for completeness.

Let $F$ be the complementary cdf of the initial distribution. Let $\theta_t$, $t \in R$, denote
the $t$-shift operator on $\mathcal{D}$, i.e. $\theta_t F : x \mapsto F(x - t)$. Since $T^k F \to F_*$, and $T^k F$ are of
the form $y \mapsto c_k e^{-y}$, $y \ge 0$, for $k \ge 1$, for any $\epsilon > 0$ there exists $k_\epsilon \in N$ such
that

    $\theta_{-\epsilon} F_* \le T^{k_\epsilon} F \le \theta_{\epsilon} F_*$.

By Strassen's Theorem, probability measures satisfying such an ordering
can be coupled in a pointwise monotone manner. In other words, there
exists a probability space $E' = (\Omega', \mathcal{F}', P')$, possibly differing from the
original space $E = (\Omega, \mathcal{F}, P)$, on which we can define a random variable
$X^{\epsilon}$ with complementary cdf $T^{k_\epsilon} F$ and two random variables $X^{-}$ and $X^{+}$
with distribution $\mu_*$ such that almost surely

(41)    $X^{-} - \epsilon \le X^{\epsilon} \le X^{+} + \epsilon$.

We now define over the product space $( \bigotimes_{v \in V} E' ) \otimes E$ the PWIT $\mathcal{T}$ and
independent copies $(X^{-}_v, X^{\epsilon}_v, X^{+}_v)_{v \in V}$ of the triple $(X^{-}, X^{\epsilon}, X^{+})$.

On $\mathcal{T}$, we look at the message process with three different initializations:

    $X^{-,0}_{\mathcal{T}}(\dot{v}, v) = X^{-}_v$, $X^{\epsilon,0}_{\mathcal{T}}(\dot{v}, v) = X^{\epsilon}_v$ and $X^{+,0}_{\mathcal{T}}(\dot{v}, v) = X^{+}_v$ for all $v \in V$.

From the update rule (35) one can readily verify that the ordering between
the messages is preserved in the following sense. For any $v \in V$ and $k \ge 0$,

    $X^{-,2k}_{\mathcal{T}}(\dot{v}, v) - \epsilon \le X^{\epsilon,2k}_{\mathcal{T}}(\dot{v}, v) \le X^{+,2k}_{\mathcal{T}}(\dot{v}, v) + \epsilon$;

    $X^{+,2k+1}_{\mathcal{T}}(\dot{v}, v) - \epsilon \le X^{\epsilon,2k+1}_{\mathcal{T}}(\dot{v}, v) \le X^{-,2k+1}_{\mathcal{T}}(\dot{v}, v) + \epsilon$.

Now fix a $v \in V$ and observe that

    $( X^{k + k_\epsilon}_{\mathcal{T}}(\dot{v}, v) )_{k \ge 0} \stackrel{d}{=} ( X^{\epsilon,k}_{\mathcal{T}}(\dot{v}, v) )_{k \ge 0}$.

It follows that for every $k \ge k_\epsilon$,

    $\sup_{s,t \ge k} \| X^{s}_{\mathcal{T}}(\dot{v}, v) - X^{t}_{\mathcal{T}}(\dot{v}, v) \|_{L^2}
        = \sup_{s,t \ge k - k_\epsilon} \| X^{\epsilon,s}_{\mathcal{T}}(\dot{v}, v) - X^{\epsilon,t}_{\mathcal{T}}(\dot{v}, v) \|_{L^2}$
        $\le 2 \sup_{t \ge k - k_\epsilon} \| X^{\pm,t}_{\mathcal{T}}(\dot{v}, v) - X(\dot{v}, v) \|_{L^2} + 2\epsilon$.

From endogeny and Lemma 6, it follows that

    $\sup_{t \ge k - k_\epsilon} \| X^{\pm,t}_{\mathcal{T}}(\dot{v}, v) - X(\dot{v}, v) \|_{L^2} \to 0$ as $k \to \infty$.

Thus the sequence $( X^{k}_{\mathcal{T}}(\dot{v}, v) )_{k \ge 0}$ is Cauchy in $L^2$, and hence convergent.
Now, Lemma 7 allows us to interchange limit and minimization in (35) to
conclude that the limit process has to be a fixed point of (35). By endogeny
there is a unique stationary configuration a.s. on any realization of the
PWIT. Hence the limit configuration has to be identical to the $X$ process.
Again by Lemma 7, for any $\epsilon > 0$, we can choose an $i_0$ such that

    $P\{ \pi^{k}_{\mathcal{T}}(\phi) \not\subset \{1, 2, \ldots, i_0\} \} \le \epsilon/3$

for all $k \ge 1$, and $P\{ C_{\mathrm{opt}}(\phi) \not\subset \{1, 2, \ldots, i_0\} \} \le \epsilon/3$. Now, the convergence
of $X^{k}_{\mathcal{T}}$ to $X$ implies that for $k$ sufficiently large, when $\pi^{k}_{\mathcal{T}}(\phi)$ and $C_{\mathrm{opt}}(\phi)$ are
contained in $\{1, 2, \ldots, i_0\}$, the probability that the two maps differ is less
than $\epsilon/3$. This proves the second statement of the theorem. □

9. Belief propagation on $\bar{K}_n$.

9.1. Convergence of the update rule on $\bar{K}_n$ to the update rule on $\mathcal{T}$.
We use from [14] the modified definition of local convergence applied to
geometric networks with edge labels, i.e., networks in which each directed
edge $(v, w)$ has a label $\lambda(v, w)$ taking values in some Polish space. For local
convergence of a sequence of such labeled networks $G_1, G_2, \ldots$ to a labeled
geometric network $G_\infty$, we add the additional requirement that the rooted
graph isomorphisms $\gamma_{n,\rho}$ satisfy

    $\lim_{n \to \infty} \lambda_{G_n}(\gamma_{n,\rho}(v, w)) = \lambda_{G_\infty}(v, w)$

for each directed edge $(v, w)$ in $N_\rho(G_\infty)$.

Now we view the configuration of BP on a graph $G$ at the $k$th iteration as
a labeled geometric network with the label on edge $(v, w)$ given by the pair
With this definition, our convergence result can be written as the following
theorem.

Theorem 14. For every fixed $k \ge 0$, the $k$th step configuration of BP
on $\bar{K}_n$ converges in the local weak sense to the $k$th step configuration of BP
on $\mathcal{T}$.

Proof. The proof of this theorem proceeds along the lines of the proof
of Theorem 4.1 of [14].

Consider an almost sure realization of the convergence $\bar{K}_n \to \mathcal{T}$.
Recall from Section 3 the labeling of the vertices of $\mathcal{T}$ from the set $V$. We
now recursively apply multiple labels from $V$ to the vertices of $\bar{K}_n$. Label
the root as $\phi$. If $v \in V$ denotes a vertex $x$ of $\bar{K}_n$ then $(v.1, v.2, \ldots, v.(n-1))$
denote the neighbours of $x$ in $\bar{K}_n$ ordered by increasing lengths of the
corresponding edge with $x$. Then the convergence in (42) is shown if we
argue that

    $\forall \{v, w\} \in E$, $X^{k}_{\bar{K}_n}(v, w) \xrightarrow{P} X^{k}_{\mathcal{T}}(v, w)$ and $\forall v \in V$, $\pi^{k}_{\bar{K}_n}(v) \xrightarrow{P} \pi^{k}_{\mathcal{T}}(v)$

as $n \to \infty$.

The above is trivially true for $k = 0$. Writing the update and decision
rules as

    $X^{k+1}_{\bar{K}_n}(w, v) = \min_{u \in \{v.1, \ldots, v.(n-1), \dot{v}\} \setminus \{w\}} \{ (\xi_{\bar{K}_n}(v, u) - X^{k}_{\bar{K}_n}(v, u))^+ \}$ and

    $\pi^{k}_{\bar{K}_n}(v) = \arg\min_{u \in \{v.1, \ldots, v.(n-1), \dot{v}\}} \{ (\xi_{\bar{K}_n}(v, u) - X^{k}_{\bar{K}_n}(v, u))^+ \}$,

we may try to use the convergence of each term on the right hand side
inductively to conclude the convergence of the term on the left. This is not
directly possible as the minimum is over an unbounded number of terms as
$n \to \infty$. However the following lemma allows us to restrict attention to a
uniformly bounded number of terms for each $n$ with probability as high as
desired, and hence obtain convergence in probability for each $k \ge 0$. □

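Lemma 8. For all $v \in V$ and $k \ge 0$,

    $\lim_{i_0 \to \infty} \limsup_{n \to \infty} P\{ \max \arg\min_{1 \le i \le n-1} \{ (\xi_{\bar{K}_n}(v, v.i) - X^{k}_{\bar{K}_n}(v, v.i))^+ \} \ge i_0 \} = 0$.

Proof. The proof is the same as the proof of Lemma 4.1 of [14]. The
only thing to keep in mind is that arg min is a set and we target the largest
index, but the same proof applies. □

9.2. Completing the upper bound: Proof of Theorem 2. By Theorem 13,
$\pi^{k}_{\mathcal{T}}(\phi) \xrightarrow{P} C_{\mathrm{opt}}(\phi)$ as $k \to \infty$. It follows that

(43)    $\sum_{v \in \pi^{k}_{\mathcal{T}}(\phi)} \xi_{\mathcal{T}}(\phi, v) \xrightarrow{P} \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi_{\mathcal{T}}(\phi, v)$ as $k \to \infty$.

We now prove convergence in expectation. Observe that

    $v \in \pi^{k}_{\mathcal{T}}(\phi) \implies \xi_{\mathcal{T}}(\phi, v) - X^{k}_{\mathcal{T}}(\phi, v) \le (\xi_{\mathcal{T}}(\phi, 1) - X^{k}_{\mathcal{T}}(\phi, 1))^+ \le \xi_{\mathcal{T}}(\phi, 1)$.

By (35), $X^{k}_{\mathcal{T}}(\phi, v) \le \xi_{\mathcal{T}}(v, v.1)$. Thus,

(44)    $v \in \pi^{k}_{\mathcal{T}}(\phi) \implies \xi_{\mathcal{T}}(\phi, v) \le \xi_{\mathcal{T}}(\phi, 1) + \xi_{\mathcal{T}}(v, v.1)$.

This implies

    $\sum_{v \in \pi^{k}_{\mathcal{T}}(\phi)} \xi_{\mathcal{T}}(\phi, v) \le \xi_{\mathcal{T}}(\phi, 1) + \sum_{i \ge 2} \xi_{\mathcal{T}}(\phi, i) 1\{ \xi_{\mathcal{T}}(\phi, i) \le \xi_{\mathcal{T}}(\phi, 1) + \xi_{\mathcal{T}}(i, i.1) \}$.

It can be verified that the sum on the right hand side in the above equation
is an integrable random variable. Equation (43) and the dominated
convergence theorem give

(45)    $\lim_{k \to \infty} E[ \sum_{v \in \pi^{k}_{\mathcal{T}}(\phi)} \xi_{\mathcal{T}}(\phi, v) ] = E[ \sum_{v \in C_{\mathrm{opt}}(\phi)} \xi_{\mathcal{T}}(\phi, v) ] = 2W(1) + W(1)^2$,

where the last equality follows from Theorem 7.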

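By Theorem 14 and Lemma 8, using the definition of local weak convergence,
we have

(46)    $\sum_{v \in \pi^{k}_{\bar{K}_n}(\phi)} \xi_{\bar{K}_n}(\phi, v) \xrightarrow{P} \sum_{v \in \pi^{k}_{\mathcal{T}}(\phi)} \xi_{\mathcal{T}}(\phi, v)$ as $n \to \infty$.

We now apply the arguments that lead to (44) to the edge-covers $\pi^{k}_{\bar{K}_n}(\phi)$,
and obtain

    $v \in \pi^{k}_{\bar{K}_n}(\phi) \implies \xi_{\bar{K}_n}(\phi, v) \le \xi_{\bar{K}_n}(\phi, 1) + \xi_{\bar{K}_n}(v, v.1)$.

For any two vertices $u, v$ of $\bar{K}_n$, define $S_n(u, v) = \min_{w \ne u, v} \xi_{\bar{K}_n}(u, w)$. Then
for a vertex $v$ of $\bar{K}_n$, $\xi_{\bar{K}_n}(\phi, 1) \le S_n(\phi, v)$ and $\xi_{\bar{K}_n}(v, v.1) \le S_n(v, \phi)$. This
gives

    $v \in \pi^{k}_{\bar{K}_n}(\phi) \implies \xi_{\bar{K}_n}(\phi, v) \le S_n(\phi, v) + S_n(v, \phi)$.

Consequently,

(47)    $\sum_{v \in \pi^{k}_{\bar{K}_n}(\phi)} \xi_{\bar{K}_n}(\phi, v) \le \sum_{v} \xi_{\bar{K}_n}(\phi, v) 1\{ \xi_{\bar{K}_n}(\phi, v) \le S_n(\phi, v) + S_n(v, \phi) \}$.

Observe that $\xi_{\bar{K}_n}(\phi, v)$, $S_n(\phi, v)$, and $S_n(v, \phi)$ are independent exponential
random variables with means $n$, $n/(n-2)$, and $n/(n-2)$ respectively. So
we can write

    $E[ \xi_{\bar{K}_n}(\phi, v) 1\{ \xi_{\bar{K}_n}(\phi, v) \le S_n(\phi, v) + S_n(v, \phi) \} ]$
    $= \int_0^{\infty} \frac{t}{n} e^{-t/n} ( 1 + \frac{(n-2) t}{n} ) e^{-(n-2)t/n} dt$
    $= \frac{3n^2 - 5n}{(n-1)^3}$.

Summing over all neighbours of $\phi$, we get

(48)    $E[ \sum_{v} \xi_{\bar{K}_n}(\phi, v) 1\{ \xi_{\bar{K}_n}(\phi, v) \le S_n(\phi, v) + S_n(v, \phi) \} ] = \frac{3n^2 - 5n}{(n-1)^2}$,

which converges to 3 as $n \to \infty$.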
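The closed form above can be verified with exact rational arithmetic. The Python sketch below is a check under the stated distributional assumptions only: the edge cost is exponential with mean $n$, and the sum of the two independent minima, each exponential with rate $a = (n-2)/n$, has tail $(1 + a t) e^{-a t}$. The function name is hypothetical.

```python
from fractions import Fraction


def truncated_cost_per_neighbour(n):
    """E[xi * 1{xi <= S + S'}] for xi ~ Exp(mean n) and S, S' ~ Exp(mean n/(n-2)).

    Uses int_0^inf t e^{-bt} dt = 1/b^2 and int_0^inf t^2 e^{-bt} dt = 2/b^3,
    where b is the edge rate plus the rate a of each minimum.
    """
    edge_rate = Fraction(1, n)
    a = Fraction(n - 2, n)
    b = edge_rate + a
    return edge_rate * (1 / b**2 + 2 * a / b**3)


def truncated_cost_at_root(n):
    """Sum over the n - 1 neighbours of the root."""
    return (n - 1) * truncated_cost_per_neighbour(n)
```

For example, `truncated_cost_per_neighbour(3)` returns `Fraction(3, 2)`, matching $(3 \cdot 9 - 15)/2^3$, and `truncated_cost_at_root(n)` approaches 3 as $n$ grows.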

Using local weak convergence, we can see that 



V 

^ iT i<P, 1) + E (<^, ^) ^{^riM<M'^^)+M^M)}■ 



i>2 



It can be verified that the expectation of the random variable on the right 
hand side above equals 3. Using this with (46), (47), and (48), the generalized 
dominated convergence theorem yields 



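\[
  \lim_{n \to \infty} \mathbb{E}\Bigl[\sum_{v \in \pi^k_{\overline{K}_n}(\phi)} \xi_{\overline{K}_n}(\phi, v)\Bigr]
  = \mathbb{E}\Bigl[\sum_{v \in \pi_T^k(\phi)} \xi_T(\phi, v)\Bigr]. \tag{49}
\]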
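The statement above that the dominating limit variable on the PWIT has expectation 3 can be checked by simulation. The sketch below is a Monte Carlo illustration under the following assumptions: the edge costs at the root form a rate-1 Poisson process, and the least child-edge cost at each child $i$, $\xi_T(i, i.1)$, is an independent mean-1 exponential. The cutoff of 40 on the gap is an assumption made for speed; beyond it the indicator has probability below $e^{-40}$.

```python
import random


def sample_dominating_limit(rng: random.Random, cutoff: float = 40.0) -> float:
    """One draw of xi(phi,1) + sum_{i>=2} xi(phi,i) * 1{xi(phi,i) <= xi(phi,1) + xi(i,i.1)}."""
    first = rng.expovariate(1.0)      # xi_T(phi, 1): first point of a rate-1 Poisson process
    total = first
    gap = 0.0                         # running value of xi_T(phi, i) - xi_T(phi, 1)
    while True:
        gap += rng.expovariate(1.0)   # next inter-arrival time at the root
        if gap > cutoff:              # remaining terms are negligible
            return total
        child_edge = rng.expovariate(1.0)   # xi_T(i, i.1): least edge cost among children of i
        if gap <= child_edge:         # same event as xi(phi,i) <= xi(phi,1) + xi(i,i.1)
            total += first + gap


rng = random.Random(0)
num_samples = 40_000
estimate = sum(sample_dominating_limit(rng) for _ in range(num_samples)) / num_samples
print(estimate)   # should be close to 3, up to Monte Carlo error
```

The exact value also follows by hand: the first term has mean 1, and conditioning on $\xi_T(\phi,1)$ the remaining sum has mean $\int_0^\infty (\xi_T(\phi,1) + t) e^{-t}\,dt = \xi_T(\phi,1) + 1$, which has mean 2.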



Combining (49) and (45) gives 



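\[
  \lim_{k \to \infty} \lim_{n \to \infty} \mathbb{E}\Bigl[\sum_{v \in \pi^k_{\overline{K}_n}(\phi)} \xi_{\overline{K}_n}(\phi, v)\Bigr] = 2W(1) + W(1)^2. \tag{50}
\]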
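The expectation in the statement of Theorem 2 can be written as
\[
  \mathbb{E}\Bigl[\sum_{e \in \pi^k_{K_n}} \xi_{K_n}(e)\Bigr]
  = \frac{1}{2}\, \mathbb{E}\Bigl[\sum_{v} \sum_{w \in \pi^k_{K_n}(v)} \xi_{K_n}(v, w)\Bigr]
  = \frac{1}{2n}\, \mathbb{E}\Bigl[\sum_{v} \sum_{w \in \pi^k_{\overline{K}_n}(v)} \xi_{\overline{K}_n}(v, w)\Bigr]
  = \frac{1}{2}\, \mathbb{E}\Bigl[\sum_{v \in \pi^k_{\overline{K}_n}(\phi)} \xi_{\overline{K}_n}(\phi, v)\Bigr]. \tag{51}
\]
In the first equality above we count the contribution of the edges of the cover
incident at each vertex of $K_n$. The factor of 1/2 appears because each edge
in the edge-cover appears twice, once for each of its endpoints. The $1/n$ in
the second equality accounts for the scaling of edge-costs from $K_n$ to $\overline{K}_n$.
The third equality holds because the root $\phi$ in $\overline{K}_n$ is chosen uniformly at
random from the $n$ vertices. (50) now completes the proof of Theorem 2. □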
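The three counting steps described above (double counting over endpoints, the $1/n$ rescaling, and averaging over a uniformly chosen root) can be illustrated on a toy instance. The sketch below is only an illustration of the bookkeeping: the edge set `cover` is a hypothetical hand-picked cover, not the output of the BP algorithm, and all variable names are assumptions.

```python
import math
import random
from itertools import combinations

n = 6
rng = random.Random(1)

# Edge costs on the rescaled graph (mean n) and on K_n (mean 1); the two differ by a factor n.
cost_bar = {frozenset(e): rng.expovariate(1.0 / n) for e in combinations(range(n), 2)}
cost = {e: c / n for e, c in cost_bar.items()}

# A hypothetical edge set touching every vertex (any edge-cover works for this identity).
cover = [frozenset(e) for e in [(0, 1), (1, 2), (3, 4), (4, 5)]]

total = sum(cost[e] for e in cover)                       # cost of the cover on K_n

# Step 1: sum over vertices of the cover edges incident at that vertex, on K_n.
per_vertex = sum(cost[e] for v in range(n) for e in cover if v in e)

# Step 2: the same double sum on the rescaled graph.
per_vertex_bar = sum(cost_bar[e] for v in range(n) for e in cover if v in e)

# Step 3: average over a uniformly chosen root of the cost incident at the root.
root_average = per_vertex_bar / n

assert math.isclose(total, per_vertex / 2)
assert math.isclose(total, per_vertex_bar / (2 * n))
assert math.isclose(total, root_average / 2)
```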

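9.3. Completing the proof of Theorem 1. Applying the scaling in (51) to
the optimal edge-covers in $K_n$ and $\overline{K}_n$, we get
\[
  \mathbb{E}\, C_n = \frac{1}{2}\, \mathbb{E}\Bigl[\sum_{v} \xi_{\overline{K}_n}(\phi, v)\Bigr],
\]
where the sum is over the neighbors $v$ of the root $\phi$ in the optimal edge-cover of $\overline{K}_n$.
Theorem 9 gives the lower bound
\[
  \liminf_{n \to \infty} \mathbb{E}\, C_n \ge W(1) + \frac{W(1)^2}{2}.
\]
By Theorem 2, for any $\epsilon > 0$, we can find $k$ large such that
\[
  \lim_{n \to \infty} \mathbb{E}\Bigl[\sum_{e \in \pi^k_{K_n}} \xi_{K_n}(e)\Bigr] \le W(1) + \frac{W(1)^2}{2} + \epsilon.
\]
This gives
\[
  \limsup_{n \to \infty} \mathbb{E}\, C_n \le W(1) + \frac{W(1)^2}{2} + \epsilon.
\]
Since $\epsilon$ is arbitrary, we get the upper bound
\[
  \limsup_{n \to \infty} \mathbb{E}\, C_n \le W(1) + \frac{W(1)^2}{2}.
\]
This completes the proof of Theorem 1. □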
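For orientation, the constants appearing in the bounds can be evaluated numerically. The sketch below computes $W(1)$, the solution of $w e^w = 1$, by Newton's method using only the standard library; the function name `lambert_w1` and the starting point are assumptions made for this illustration.

```python
import math


def lambert_w1(tol: float = 1e-15) -> float:
    """Solve w * exp(w) = 1 by Newton's method; the root is W(1)."""
    w = 0.5
    for _ in range(50):
        # Newton step for f(w) = w*exp(w) - 1, with f'(w) = exp(w) * (1 + w).
        step = (w * math.exp(w) - 1.0) / (math.exp(w) * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w


w = lambert_w1()
limit_cost = w + w * w / 2     # W(1) + W(1)^2 / 2, the limit of E C_n in Theorem 1
root_cost = 2 * w + w * w      # 2 W(1) + W(1)^2, the per-root quantity in (45) and (50)
print(w, limit_cost, root_cost)   # W(1) is about 0.567 and the limit is about 0.728
```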

Observe that for any $\epsilon > 0$, there exist $K_\epsilon$ and $N_\epsilon$ such that for all $k > K_\epsilon$
and $n > N_\epsilon$, we have
\[
  \mathbb{E}\Bigl[\sum_{e \in \pi^k_{K_n}} \xi_{K_n}(e)\Bigr] \le W(1) + \frac{W(1)^2}{2} + \epsilon.
\]
Thus for large $n$ the BP algorithm gives a solution with expected cost within $\epsilon$ of
the optimal value in $K_\epsilon$ iterations. In an iteration, the algorithm requires
$O(n)$ computations at every vertex. This gives an $O(K_\epsilon n^2)$ running time
for the BP algorithm to compute an $\epsilon$-approximate solution. The worst case
complexity of the edge-cover problem is $O(n^3)$, a result due to Edmonds and
Johnson (1970); see [15, Theorem 27.2].
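The $O(n)$ work per vertex per iteration can be made concrete with a small sketch. The update rule is not restated in this section, so the code below rests on an assumption that should be checked against the algorithm's definition earlier in the paper: messages are updated as $X^{k+1}(w, v) = \min_{u \ne w} (\xi(v, u) - X^k(v, u))^+$, and vertex $v$ selects the neighbors attaining $\min_u (\xi(v, u) - X^k(v, u))^+$, which is the reading suggested by the bounds used in Section 9.2. The zero initialization, the function name, and the test instance are likewise assumptions.

```python
import heapq
import random


def bp_edge_cover(cost, iterations):
    """Run the ASSUMED BP update for `iterations` rounds and return the selected edge set."""
    n = len(cost)
    # msg[w][v] plays the role of X(w, v): the message sent from v to w. Start from zeros.
    msg = [[0.0] * n for _ in range(n)]

    def slacks(v):
        # (cost(v,u) - X(v,u))^+ for every neighbor u of v
        return [(max(cost[v][u] - msg[v][u], 0.0), u) for u in range(n) if u != v]

    for _ in range(iterations):
        new = [[0.0] * n for _ in range(n)]
        for v in range(n):
            # The two smallest slacks give min over u != w for every w: O(n) work at v.
            (m1, u1), (m2, _) = heapq.nsmallest(2, slacks(v))
            for w in range(n):
                if w != v:
                    new[w][v] = m2 if w == u1 else m1
        msg = new

    edges = set()
    for v in range(n):
        s = slacks(v)
        best = min(val for val, _ in s)
        # Every vertex selects at least one neighbor, so the result is always an edge-cover.
        edges.update(frozenset((v, u)) for val, u in s if val == best)
    return edges


rng = random.Random(0)
n = 50
cost = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        cost[i][j] = cost[j][i] = rng.expovariate(1.0 / n)   # mean-n costs (rescaled graph)

cover = bp_edge_cover(cost, iterations=10)
covered = set().union(*cover)
cover_cost = sum(cost[min(e)][max(e)] for e in cover) / n     # cost on the mean-1 scale
print(len(cover), cover_cost)
```

Each round touches every ordered pair of vertices once, so $k$ rounds cost $O(k n^2)$ in this sketch, matching the count in the text.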

10. More results. Our main results for the edge-cover problem were 
the proof of the limit of the expected minimum cost (Theorem 1) and the 
means to obtain an asymptotically optimal solution using the BP algorithm 
(Theorem 2). The use of the objective method as the proof technique allows us to
obtain several auxiliary results about the structure of the optimal solution, 
through calculations for the edge-cover Copt on the PWIT. In this section 
we state and prove, as examples, results for the distribution of the degree of 
the root and the probability that the least cost edge at the root is part of 
the optimal edge-cover $C_{\mathrm{opt}}$. It is easy to show using local weak convergence
and the results of Sections 8 and 9 that these quantities arise as limits of
the quantities corresponding to the edge-covers $\pi^k_{\overline{K}_n}$.

Theorem 15.
\[
  \mathbb{P}\{|C_{\mathrm{opt}}(\phi)| = 1\} = e^{-W(1)} (1 + W(1)).
\]
For $k \ge 2$,
\[
  \mathbb{P}\{|C_{\mathrm{opt}}(\phi)| = k\} = e^{-W(1)}\, \frac{W(1)^k}{k!}.
\]

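Proof. As in the proof of Theorem 6, $\{(\xi_j, X_j), j \ge 1\}$ is a Poisson
process on $\mathbb{R}_+ \times \mathbb{R}_+$ with intensity $dz\, dF_*(x)$.
From the definition of $C_{\mathrm{opt}}$,
\[
\begin{aligned}
  \mathbb{P}\{|C_{\mathrm{opt}}(\phi)| = 1\}
  &= \mathbb{P}\{\text{At most one point of } \{(\xi_j, X_j)\} \text{ in } \{(z, x) : z - x < 0\}\} \\
  &= e^{-A}(1 + A),
\end{aligned}
\]
where
\[
\begin{aligned}
  A &= \int_{z=0}^{\infty} \int_{x=z}^{\infty} dF_*(x)\, dz \\
    &= \int_{z=0}^{\infty} W(1)\, e^{-z}\, dz \\
    &= W(1).
\end{aligned}
\]
Thus,
\[
  \mathbb{P}\{|C_{\mathrm{opt}}(\phi)| = 1\} = e^{-W(1)} (1 + W(1)).
\]
For $k \ge 2$,
\[
\begin{aligned}
  \mathbb{P}\{|C_{\mathrm{opt}}(\phi)| = k\}
  &= \mathbb{P}\{k \text{ points of } \{(\xi_j, X_j)\} \text{ in } \{(z, x) : z - x < 0\}\} \\
  &= e^{-A}\, \frac{A^k}{k!} \\
  &= e^{-W(1)}\, \frac{W(1)^k}{k!}. \qquad \square
\end{aligned}
\]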
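A quick numerical check of Theorem 15: the stated probabilities should sum to 1 over $k \ge 1$. The sketch below verifies this and also evaluates the mean root degree, which under these formulas works out to $W(1) + e^{-W(1)} = 2W(1)$ (using $e^{-W(1)} = W(1)$). That identity is a derived observation, not a statement from the text; the helper names and the truncation at $k = 60$ are assumptions.

```python
import math


def lambert_w1() -> float:
    """W(1): the root of w * exp(w) = 1, found by Newton's method."""
    w = 0.5
    for _ in range(50):
        w -= (w * math.exp(w) - 1.0) / (math.exp(w) * (1.0 + w))
    return w


W1 = lambert_w1()


def degree_pmf(k: int) -> float:
    """P{|C_opt(phi)| = k} as stated in Theorem 15, for k >= 1."""
    if k == 1:
        return math.exp(-W1) * (1.0 + W1)
    return math.exp(-W1) * W1**k / math.factorial(k)


# Terms beyond k = 60 are far below double precision.
total_mass = sum(degree_pmf(k) for k in range(1, 61))
mean_degree = sum(k * degree_pmf(k) for k in range(1, 61))
print(total_mass, mean_degree, 2 * W1)
```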

Theorem 16.
\[
  \mathbb{P}\{1 \in C_{\mathrm{opt}}(\phi)\} = \frac{W(1)}{2} + \frac{1}{W(1)} - W(1)^2 - 1.
\]

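Proof. The event $\{1 \in C_{\mathrm{opt}}(\phi)\}$ equals the union of two disjoint events:

(a) $\xi(\phi, 1) - X(\phi, 1) < 0$, and

(b) $0 \le \xi(\phi, 1) - X(\phi, 1) < \xi(\phi, i) - X(\phi, i)$ for all $i \ge 2$.

The probability of the first event is
\[
\begin{aligned}
  \mathbb{P}\{\xi(\phi, 1) - X(\phi, 1) < 0\}
  &= \int_{z=0}^{\infty} \int_{x=z}^{\infty} dF_*(x)\, e^{-z}\, dz \\
  &= \int_{z=0}^{\infty} W(1)\, e^{-z} e^{-z}\, dz \\
  &= \frac{W(1)}{2}.
\end{aligned}
\]
For the second event, write $\xi(\phi, i) = \xi(\phi, 1) + \xi'_i$, where obviously $\{\xi'_i, i \ge 2\}$
is a rate 1 Poisson process independent of $\{X(\phi, i), i \ge 2\}$. For $i \ge 2$,
$\xi(\phi, 1) - X(\phi, 1) < \xi(\phi, i) - X(\phi, i)$ if and only if $-X(\phi, 1) < \xi'_i - X(\phi, i)$.
The probability of the second event can be written as
\[
\begin{aligned}
  &\mathbb{P}\{0 \le \xi(\phi, 1) - X(\phi, 1) < \xi(\phi, i) - X(\phi, i) \text{ for all } i \ge 2\} \\
  &\quad = \int_{x_1=0}^{\infty} \int_{z_1=x_1}^{\infty} \mathbb{P}\{\text{No point of } \{(\xi'_i, X(\phi, i)), i \ge 2\} \text{ in } \{(z, x) : z - x < -x_1\}\}\, e^{-z_1}\, dz_1\, dF_*(x_1) \\
  &\quad = \int_{x_1=0}^{\infty} e^{-x_1} \exp\Bigl(-\int_{z=0}^{\infty} \int_{x=z+x_1}^{\infty} dF_*(x)\, dz\Bigr)\, dF_*(x_1) \\
  &\quad = \int_{x_1=0}^{\infty} e^{-x_1} \exp\Bigl(-\int_{z=0}^{\infty} W(1)\, e^{-z} e^{-x_1}\, dz\Bigr)\, dF_*(x_1) \\
  &\quad = \int_{x_1=0}^{\infty} e^{-x_1} \exp\bigl(-W(1)\, e^{-x_1}\bigr)\, dF_*(x_1) \\
  &\quad = W(1)(1 - W(1)) + \int_{x_1=0}^{\infty} W(1)\, e^{-2x_1} \exp\bigl(-W(1)\, e^{-x_1}\bigr)\, dx_1 \\
  &\quad = \frac{1}{W(1)} - W(1)^2 - 1. \qquad \square
\end{aligned}
\]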
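The two integrals in this proof can be checked numerically against the closed form of Theorem 16. The sketch below assumes, as read off from the last steps of the computation, that $F_*$ has an atom of mass $1 - W(1)$ at 0 and density $W(1) e^{-x}$ on $x > 0$. Simpson's rule, the truncation of the integration range at 40, and the helper names are choices made for this illustration.

```python
import math


def lambert_w1() -> float:
    """W(1): the root of w * exp(w) = 1, found by Newton's method."""
    w = 0.5
    for _ in range(50):
        w -= (w * math.exp(w) - 1.0) / (math.exp(w) * (1.0 + w))
    return w


def simpson(f, a: float, b: float, n: int = 20_000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3


W1 = lambert_w1()

# Event (a): integral of W(1) * exp(-z) * exp(-z) over z >= 0.
event_a = simpson(lambda z: W1 * math.exp(-2 * z), 0.0, 40.0)

# Event (b): atom of F_* at 0 plus the absolutely continuous part.
atom_part = (1.0 - W1) * math.exp(-W1)   # integrand e^{-x} exp(-W(1) e^{-x}) at x = 0, times the atom mass
density_part = simpson(
    lambda x: W1 * math.exp(-2 * x) * math.exp(-W1 * math.exp(-x)), 0.0, 40.0
)
event_b = atom_part + density_part

closed_form = W1 / 2 + 1 / W1 - W1**2 - 1
print(event_a + event_b, closed_form)   # the two values should agree closely
```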



11. Summary. In a nutshell, we have implemented Aldous's program
based on [4] to solve the random edge-cover problem. Aldous's program 
serves as a rigorous mathematical alternative to the cavity method applied 
to mean-field combinatorial optimization problems. Aldous and Bandyopa- 
dhyay [5, Sec. 7.5] outline the steps of this rigorous methodology, highlighting 
the role of RDEs and endogeny. See below. 

But first, we must indicate another way in which the complete graph with 
i.i.d. edge weights arises. Combinatorial optimization problems involving 
$n$ random points on $\mathbb{R}^d$ are of interest in many physical settings, but are
typically difficult to analyze because of dependence of the random variables
representing the $\binom{n}{2}$ distances. A more tractable mean-field model ignores the
underlying d-dimensional space, and simply models the interpoint distances 
as i.i.d. random variables. This resulting model is then the complete graph 
on n vertices with i.i.d. edge weights. The case of exponential mean 1 edge 
weights models the d = 1 setting. There are other distributions to model the 
d > 1 settings. Though we did not deal with d > 1 in this paper, we expect 
the extension to hold (as for matching). 

Let us return to Aldous's program, as summarized by Aldous and Bandyo- 
padhyay [5, Sec. 7.5] and reproduced below. 
"Start with a combinatorial optimization problem over some size-n random
structure.

• Formulate a "size-∞" random structure, the n → ∞ limit in the sense
of local weak convergence.

• Formulate a corresponding combinatorial optimization problem on the
size-∞ structure.

• Heuristically define relevant quantities on the size-∞ structure via ad-
ditive renormalization . . .

• If the size-∞ structure is treelike (the only case where one expects
exact asymptotic solutions), observe that the relevant quantities satisfy
a problem dependent RDE.

• Solve the RDE. Use the unique solution to find the value of the opti-
mization problem on the size-∞ structure.

• Show that the RTP associated with the solution is endogenous. 

• Endogeny shows that the optimal solution is a measurable function of 
the data, in the infinite-size problem. Since a measurable function is 
almost continuous, we can pull back to define almost-feasible solutions 
of the size-n problem with almost the same cost. 

• Show that in the size-n problem one can patch an almost-feasible so- 
lution into a feasible solution for asymptotically negligible cost." [5, 
Sec. 7.5] 

The size-n random structure is the complete graph on n vertices, $\overline{K}_n$,
with independent exponential mean-n edge weights. The following points
elaborate on how we addressed the steps above. 

• The size-∞ random structure is the PWIT.

• The corresponding optimization problem on the size-∞ structure is
simply the minimum cost edge-cover on the PWIT. While this step
is easy for the edge-cover problem, in general some subtleties are
involved. For example, the limiting size-∞ problem for Frieze's size-n
problem of minimal spanning tree on $K_n$ [8] is a minimal spanning
forest with certain requirements on the included edges. See [3, Defn. 4.2]
for details. 

• We then heuristically provided the quantities relevant to the edge- 
cover problem on the PWIT in Section 4. The additive renormalization 
measured the reduction in cost arising from the relaxation of the 
requirement that the root be hit. 

• Using the tree structure of the limiting object, we obtained the RDE 
(13) associated with the edge-cover problem. 

• We solved the RDE in Theorem 6, showed that it had a unique solu- 
tion, and found the value of the optimization problem on the PWIT in 
Theorem 7. Another important step is Theorem 8 which proves that
the edge-cover $C_{\mathrm{opt}}$, based on the heuristic relation (10), is optimal
among involution invariant edge-covers on the PWIT. Our method 
for establishing this nontrivial step may have some bearing on other 
similar combinatorial optimization problems. This step eventually es- 
tablished a lower bound for the liminf of size-n optimal values. 

• Theorem 12 established endogeny of the RTP associated with the
solution of (13). Theorem 2, corresponding to the BP algorithm on
$K_n$, replaces the procedure of Aldous's program for obtaining solutions
of the size-n problem from the solution of the size-∞ problem. The
key steps for this are based on Salez and Shah's approach [14] and
are as follows. Using endogeny, we argued that BP (with i.i.d.
initializations) converges to the RDE-based stationary configuration on the
PWIT. We then established that, for large n, the BP update at a
particular node depends essentially only on messages from its local
neighbourhood (Lemma 8). This is then used to express BP on the
PWIT as the limit of BP on $K_n$. The BP iterates on $K_n$ were then
the candidate solutions for the size-n problem.

• No corrective patch-up was needed for the size-n problem, since at 
each iteration of the BP algorithm, every vertex was covered by the 
corresponding selection of edges. Simple dominated convergence argu- 
ments then established the convergence of the expected optimal costs 
to the correct value. 

It is worth noting that the upper bound result in Theorem 1 can be
obtained via a simpler proof of Theorem 2 for a version of the BP algorithm
in which the messages are initialized as i.i.d. random variables from the
fixed-point distribution $F_*$. In this case Lemma 6, which follows from endogeny,
establishes the convergence result on the PWIT. The more general result
of Theorem 13 shows that BP works when messages are initialized as i.i.d.
random variables from an arbitrary distribution.

Finally, we must mention that Aldous [4] proved a strong property called 
asymptotic essential uniqueness for matching, which is roughly the property 
that if a matching on Kn is almost optimal, then it coincides with the 
optimal matching except on a small proportion of edges. The question of 
whether this property holds for the edge-cover problem is one that we hope 
to address in the near future. 